Your current environment
I'm using the latest vLLM Docker image (v0.6.2).
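If it helps triage: the transformers release installed in the image matters here as much as the vLLM version, since transformers emits the `rope_scaling` validation warning seen in the log below. A minimal sketch for confirming both from inside the container (nothing assumed beyond the standard `__version__` attributes):

```python
# Confirm the exact library versions inside the container; the
# rope_scaling warning below comes from transformers, so its
# version matters alongside vLLM's.
import transformers
import vllm

print("vllm:", vllm.__version__)
print("transformers:", transformers.__version__)
```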
Model Input Dumps
No response
🐛 Describe the bug
INFO 10-10 00:56:44 api_server.py:164] Multiprocessing frontend to use ipc:///tmp/6f288ab9-add1-4cfb-a217-af1687e882b5 for IPC Path.
qwen72-1 | INFO 10-10 00:56:44 api_server.py:177] Started engine process with PID 36
qwen72-1 | Unrecognized keys in `rope_scaling` for 'rope_type'='default': {'mrope_section'}
qwen72-1 | Traceback (most recent call last):
qwen72-1 | File "", line 198, in _run_module_as_main
qwen72-1 | File "", line 88, in _run_code
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 571, in
qwen72-1 | uvloop.run(run_server(args))
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/uvloop/init.py", line 109, in run
qwen72-1 | return __asyncio.run(
qwen72-1 | ^^^^^^^^^^^^^^
qwen72-1 | File "/usr/lib/python3.12/asyncio/runners.py", line 194, in run
qwen72-1 | return runner.run(main)
qwen72-1 | ^^^^^^^^^^^^^^^^
qwen72-1 | File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
qwen72-1 | return self._loop.run_until_complete(task)
qwen72-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
qwen72-1 | File "uvloop/loop.pyx", line 1517, in uvloop.loop.Loop.run_until_complete
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/uvloop/init.py", line 61, in wrapper
qwen72-1 | return await main
qwen72-1 | ^^^^^^^^^^
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 538, in run_server
qwen72-1 | async with build_async_engine_client(args) as engine_client:
qwen72-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
qwen72-1 | File "/usr/lib/python3.12/contextlib.py", line 210, in aenter
qwen72-1 | return await anext(self.gen)
qwen72-1 | ^^^^^^^^^^^^^^^^^^^^^
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 105, in build_async_engine_client
qwen72-1 | async with build_async_engine_client_from_engine_args(
qwen72-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
qwen72-1 | File "/usr/lib/python3.12/contextlib.py", line 210, in aenter
qwen72-1 | return await anext(self.gen)
qwen72-1 | ^^^^^^^^^^^^^^^^^^^^^
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 182, in build_async_engine_client_from_engine_args
qwen72-1 | engine_config = engine_args.create_engine_config()
qwen72-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/vllm/engine/arg_utils.py", line 874, in create_engine_config
qwen72-1 | model_config = self.create_model_config()
qwen72-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/vllm/engine/arg_utils.py", line 811, in create_model_config
qwen72-1 | return ModelConfig(
qwen72-1 | ^^^^^^^^^^^^
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/vllm/config.py", line 207, in init
qwen72-1 | self.max_model_len = _get_and_verify_max_len(
qwen72-1 | ^^^^^^^^^^^^^^^^^^^^^^^^
qwen72-1 | File "/usr/local/lib/python3.12/dist-packages/vllm/config.py", line 1746, in _get_and_verify_max_len
qwen72-1 | assert "factor" in rope_scaling
qwen72-1 | ^^^^^^^^^^^^^^^^^^^^^^^^
qwen72-1 | AssertionError
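For context on the assertion: Qwen2-VL-style checkpoints ship a `rope_scaling` block with the legacy `type` key and an `mrope_section` list (sketch below; values illustrative, assuming a Qwen2-VL 72B config). Newer transformers releases normalize `type` to `rope_type` and appear to treat `mrope` as `default`, which would explain the "Unrecognized keys" warning above; and since the dict carries no `factor` key, vLLM's `_get_and_verify_max_len` assertion then fails:

```python
# Sketch of the failing check, assuming a Qwen2-VL-style rope_scaling
# block from the checkpoint's config.json (values illustrative):
rope_scaling = {
    "type": "mrope",               # legacy key; newer transformers maps it to rope_type
    "mrope_section": [16, 24, 24], # multimodal rope sections, unknown to rope_type "default"
}

# vllm/config.py:_get_and_verify_max_len expects a scaling factor in the
# dict, but an mrope-style dict has none:
assert "factor" in rope_scaling  # AssertionError, matching the traceback above
```

If that reading is right, this is a version skew between vLLM 0.6.2's config handling and the transformers release bundled in the image, not a problem with the checkpoint itself.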
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.