Closed as not planned
Description
System Info
python: 3.11.0
vllm: 0.8.4
cuda: 11.6
Running Xinference with Docker?
- docker
- pip install
- installation from source
Version info
xinference: 1.5.0.post2
The command used to start Xinference
xinference-local -H 0.0.0.0 -p 8888 --auth-config xxx
Reproduction
- Start xinference
- Launch InternVL3-14B from a local file, with vllm selected as the engine
- Requests via the xinference client and via the OpenAI-compatible API both fail. Following the official docs, neither the plain LLM message format nor the VL model message format works. The request code is:
from xinference.client import Client
client = Client('http://localhost:8888', api_key='')
model = client.get_model('InternVL3-14B')
messages = [{'role': 'user', 'content': 'What is the largest animal?'}]
# messages = [{"role": "user", "content": [{'type': 'text', 'text': "What is the largest animal?"}]}]
response = model.chat(
    messages=messages
)
print(response)

Full error log:
Traceback (most recent call last):
File "/data/anaconda3/envs/xinference/lib/python3.11/site-packages/xinference/api/restful_api.py", line 2128, in create_chat_completion
data = await model.chat(
^^^^^^^^^^^^^^^^^
File "/data/anaconda3/envs/xinference/lib/python3.11/site-packages/xoscar/backends/context.py", line 262, in send
return self._process_result_message(result)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/anaconda3/envs/xinference/lib/python3.11/site-packages/xoscar/backends/context.py", line 111, in _process_result_message
raise message.as_instanceof_cause()
File "/data/anaconda3/envs/xinference/lib/python3.11/site-packages/xoscar/backends/pool.py", line 689, in send
result = await self._run_coro(message.message_id, coro)
^^^^^^^^^^^^^^^^^
File "/data/anaconda3/envs/xinference/lib/python3.11/site-packages/xoscar/backends/pool.py", line 389, in _run_coro
return await coro
File "/data/anaconda3/envs/xinference/lib/python3.11/site-packages/xoscar/api.py", line 384, in __on_receive__
return await super().__on_receive__(message) # type: ignore
^^^^^^^^^^^^^^^^^
File "xoscar/core.pyx", line 564, in __on_receive__
raise ex
File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.__on_receive__
async with self._lock:
^^^^^^^^^^^^^^^^^
File "xoscar/core.pyx", line 527, in xoscar.core._BaseActor.__on_receive__
with debug_async_timeout('actor_lock_timeout',
^^^^^^^^^^^^^^^^^
File "xoscar/core.pyx", line 532, in xoscar.core._BaseActor.__on_receive__
result = await result
^^^^^^^^^^^^^^^^^
File "/data/anaconda3/envs/xinference/lib/python3.11/site-packages/xinference/core/model.py", line 106, in wrapped_func
ret = await fn(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^
File "/data/anaconda3/envs/xinference/lib/python3.11/site-packages/xoscar/api.py", line 462, in _wrapper
r = await func(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^
File "/data/anaconda3/envs/xinference/lib/python3.11/site-packages/xinference/core/utils.py", line 93, in wrapped
ret = await func(*args, **kwargs)
^^^^^^^^^^^^^^^^^
File "/data/anaconda3/envs/xinference/lib/python3.11/site-packages/xinference/core/model.py", line 855, in chat
response = await self._call_wrapper_json(
^^^^^^^^^^^^^^^^^
File "/data/anaconda3/envs/xinference/lib/python3.11/site-packages/xinference/core/model.py", line 662, in _call_wrapper_json
return await self._call_wrapper("json", fn, *args, **kwargs)
^^^^^^^^^^^^^^^^^
File "/data/anaconda3/envs/xinference/lib/python3.11/site-packages/xinference/core/model.py", line 141, in _async_wrapper
return await fn(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^
File "/data/anaconda3/envs/xinference/lib/python3.11/site-packages/xinference/core/model.py", line 672, in _call_wrapper
ret = await fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^
File "/data/anaconda3/envs/xinference/lib/python3.11/site-packages/xinference/model/llm/vllm/utils.py", line 30, in _async_wrapper
return await fn(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^
File "/data/anaconda3/envs/xinference/lib/python3.11/site-packages/xinference/model/llm/vllm/core.py", line 1143, in async_chat
prompt = self.get_full_context(
^^^^^^^^^^^^^^^^^
File "/data/anaconda3/envs/xinference/lib/python3.11/site-packages/xinference/model/llm/utils.py", line 155, in get_full_context
return self._build_from_raw_template(messages, chat_template, **kwargs)
^^^^^^^^^^^^^^^^^
File "/data/anaconda3/envs/xinference/lib/python3.11/site-packages/xinference/model/llm/utils.py", line 120, in _build_from_raw_template
rendered = compiled_template.render(
^^^^^^^^^^^^^^^^^
File "/data/anaconda3/envs/xinference/lib/python3.11/site-packages/jinja2/environment.py", line 1295, in render
self.environment.handle_exception()
^^^^^^^^^^^^^^^^^
File "/data/anaconda3/envs/xinference/lib/python3.11/site-packages/jinja2/environment.py", line 942, in handle_exception
raise rewrite_traceback_stack(source=source)
^^^^^^^^^^^^^^^^^
File "<template>", line 23, in top-level template code
TypeError: [address=0.0.0.0:36647, pid=5507] can only concatenate str (not "list") to str

For comparison, Qwen2.5-7B-instruct returns a correct response when requested in exactly the same way.
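The traceback ends inside Jinja template rendering (`<template>`, line 23), which suggests the model's chat template concatenates a string literal with `message['content']` while the content has been normalized to a list (the OpenAI multimodal format). A minimal sketch reproducing the same TypeError, using a hypothetical one-line template (the real InternVL3 template is not shown here):

```python
from jinja2 import Template

# Hypothetical template fragment: plain '+' concatenation with the
# message content, as a chat template written for string content might do.
tmpl = Template("{{ '<|user|>\n' + message['content'] }}")

# String content renders fine:
print(tmpl.render(message={"role": "user", "content": "hi"}))

# List content (OpenAI multimodal format) raises the same TypeError,
# because Python's str + list is not defined:
try:
    tmpl.render(
        message={"role": "user", "content": [{"type": "text", "text": "hi"}]}
    )
except TypeError as e:
    print(type(e).__name__, e)  # can only concatenate str (not "list") to str
```

If this is the cause, the template would need to iterate over list-form content (or the server would need to flatten it to a string) before concatenation.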
Expected behavior
A correct response should be returned.