
[Bug]: AssertionError when deploying the API server for Qwen2-VL-72B in Docker #9236

@FBR65

Description


Your current environment

I'm using the latest vLLM Docker image (0.6.2).

Model Input Dumps

No response

🐛 Describe the bug

INFO 10-10 00:56:44 api_server.py:164] Multiprocessing frontend to use ipc:///tmp/6f288ab9-add1-4cfb-a217-af1687e882b5 for IPC Path.
qwen72-1 | INFO 10-10 00:56:44 api_server.py:177] Started engine process with PID 36
qwen72-1 | Unrecognized keys in rope_scaling for 'rope_type'='default': {'mrope_section'}
qwen72-1 | Traceback (most recent call last):
qwen72-1 | File "<frozen runpy>", line 198, in _run_module_as_main
qwen72-1 | File "<frozen runpy>", line 88, in _run_code
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 571, in <module>
qwen72-1 | uvloop.run(run_server(args))
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/uvloop/__init__.py", line 109, in run
qwen72-1 | return __asyncio.run(
qwen72-1 | ^^^^^^^^^^^^^^
qwen72-1 | File "/usr/lib/python3.12/asyncio/runners.py", line 194, in run
qwen72-1 | return runner.run(main)
qwen72-1 | ^^^^^^^^^^^^^^^^
qwen72-1 | File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
qwen72-1 | return self._loop.run_until_complete(task)
qwen72-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
qwen72-1 | File "uvloop/loop.pyx", line 1517, in uvloop.loop.Loop.run_until_complete
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/uvloop/__init__.py", line 61, in wrapper
qwen72-1 | return await main
qwen72-1 | ^^^^^^^^^^
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 538, in run_server
qwen72-1 | async with build_async_engine_client(args) as engine_client:
qwen72-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
qwen72-1 | File "/usr/lib/python3.12/contextlib.py", line 210, in __aenter__
qwen72-1 | return await anext(self.gen)
qwen72-1 | ^^^^^^^^^^^^^^^^^^^^^
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 105, in build_async_engine_client
qwen72-1 | async with build_async_engine_client_from_engine_args(
qwen72-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
qwen72-1 | File "/usr/lib/python3.12/contextlib.py", line 210, in __aenter__
qwen72-1 | return await anext(self.gen)
qwen72-1 | ^^^^^^^^^^^^^^^^^^^^^
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 182, in build_async_engine_client_from_engine_args
qwen72-1 | engine_config = engine_args.create_engine_config()
qwen72-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/vllm/engine/arg_utils.py", line 874, in create_engine_config
qwen72-1 | model_config = self.create_model_config()
qwen72-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/vllm/engine/arg_utils.py", line 811, in create_model_config
qwen72-1 | return ModelConfig(
qwen72-1 | ^^^^^^^^^^^^
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/vllm/config.py", line 207, in __init__
qwen72-1 | self.max_model_len = _get_and_verify_max_len(
qwen72-1 | ^^^^^^^^^^^^^^^^^^^^^^^^
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/vllm/config.py", line 1746, in _get_and_verify_max_len
qwen72-1 | assert "factor" in rope_scaling
qwen72-1 | ^^^^^^^^^^^^^^^^^^^^^^^^
qwen72-1 | AssertionError
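To make the failure mode concrete, here is a minimal, simplified sketch of the check that trips in vllm/config.py's _get_and_verify_max_len. The warning line above ("Unrecognized keys in rope_scaling for 'rope_type'='default': {'mrope_section'}") suggests the model's rope_scaling config reaches vLLM with a "rope_type"/"mrope_section" shape rather than one containing a "factor" key. The function name, control flow, and mrope_section values below are illustrative assumptions, not vLLM's actual code:

```python
# Simplified stand-in (hypothetical) for the failing assertion in
# vllm/config.py::_get_and_verify_max_len on vLLM 0.6.2.
def verify_rope_scaling(rope_scaling: dict) -> None:
    # This mirrors the `assert "factor" in rope_scaling` shown in the
    # traceback; the surrounding logic in vLLM is omitted here.
    assert "factor" in rope_scaling

# rope_scaling roughly as the warning implies it arrives in vLLM
# (mrope_section values are illustrative, not taken from the config):
normalized = {"rope_type": "default", "mrope_section": [16, 24, 24]}

try:
    verify_rope_scaling(normalized)
except AssertionError:
    # There is no "factor" key in the dict, so the assert fires --
    # matching the AssertionError at the end of the traceback.
    print("AssertionError: no 'factor' in rope_scaling")
```

Since the dict carries "rope_type" and "mrope_section" but no "factor", the bare assert fails with no message, which matches the empty AssertionError in the log.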

Before submitting a new issue...

  • Make sure you have already searched for relevant issues and asked the chatbot in the bottom-right corner of the documentation page, which can answer many frequently asked questions.

Metadata

Assignees: No one assigned
Labels: bug (Something isn't working)
