Closed
Labels: bug (Confirmed bugs)
Description
🐛 Bug
Unicode/UTF-8 Character Handling Issue in REST API "/v1/chat/completions" Endpoint
To Reproduce
Steps to reproduce the behavior:
- Spawn a REST API server with a model that can output CJK characters or emoji.
- Send a streaming chat completion request:
curl --data '{"model":"","messages":[{"role":"user","content":"Introduce yourself with lots of emojis"}],"stream":true}' --header 'Content-Type: application/json' http://127.0.0.1:8000/v1/chat/completions
- Observe the streamed output:
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"Hello!"},"finish_reason":"stop"}]}
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":" "},"finish_reason":"stop"}]}
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"�"},"finish_reason":"stop"}]}
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"�"},"finish_reason":"stop"}]}
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"�"},"finish_reason":"stop"}]}
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":"stop"}]}
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"�"},"finish_reason":"stop"}]}
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"�"},"finish_reason":"stop"}]}
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"�"},"finish_reason":"stop"}]}
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":"stop"}]}
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":" I"},"finish_reason":"stop"}]}
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"'"},"finish_reason":"stop"}]}
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"m"},"finish_reason":"stop"}]}
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":" just"},"finish_reason":"stop"}]}
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":" an"},"finish_reason":"stop"}]}
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":" A"},"finish_reason":"stop"}]}
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"I"},"finish_reason":"stop"}]}
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":" assistant"},"finish_reason":"stop"}]}
...
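A plausible cause, which this report does not confirm, is that the bytes of a multi-byte UTF-8 character arrive in separate pieces and each piece is decoded before the character is complete. The following is a general illustration in plain Python, not the server's actual decoding path: decoding each byte of an emoji on its own produces only U+FFFD, while an incremental decoder buffers the partial sequence until it can emit the full character.

```python
import codecs

emoji_bytes = "😊".encode("utf-8")  # four bytes: f0 9f 98 8a

# Decoding each byte in isolation can never succeed, so every piece
# is replaced by U+FFFD (the replacement character).
naive = [bytes([b]).decode("utf-8", errors="replace") for b in emoji_bytes]

# An incremental decoder keeps incomplete sequences in an internal buffer
# and only returns text once a whole code point has arrived.
decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
buffered = [decoder.decode(bytes([b])) for b in emoji_bytes]

print(naive)
print(buffered)
```

The exact pattern of replacement characters in the output above depends on how the server's tokenizer splits and decodes the text, so it need not match this illustration one to one.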
Expected behavior
Expected output:
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"Hello!"},"finish_reason":"stop"}]}
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":" "},"finish_reason":"stop"}]}
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"😊"},"finish_reason":"stop"}]}
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"👋"},"finish_reason":"stop"}]}
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":" I"},"finish_reason":"stop"}]}
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"'"},"finish_reason":"stop"}]}
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"m"},"finish_reason":"stop"}]}
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":" just"},"finish_reason":"stop"}]}
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":" an"},"finish_reason":"stop"}]}
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":" A"},"finish_reason":"stop"}]}
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"I"},"finish_reason":"stop"}]}
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":" assistant"},"finish_reason":"stop"}]}
...
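To check a captured stream against the expected output, one option is to scan the `data:` lines for deltas that contain U+FFFD. This is a minimal sketch; `find_broken_chunks` is a hypothetical helper name, and it assumes each event is a single `data: ` line carrying a JSON object shaped like the ones shown above.

```python
import json

def find_broken_chunks(sse_lines):
    """Return the content of every streamed delta that contains U+FFFD."""
    broken = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue
        try:
            event = json.loads(line[len("data: "):])
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON payload
        for choice in event.get("choices", []):
            content = choice.get("delta", {}).get("content") or ""
            if "\ufffd" in content:
                broken.append(content)
    return broken

# Sample lines modelled on the output in this report.
sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"Hello!"},"finish_reason":"stop"}]}',
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"\\ufffd"},"finish_reason":"stop"}]}',
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"😊"},"finish_reason":"stop"}]}',
]
print(find_broken_chunks(sample))
```

An empty result means no chunk in the captured stream carried a replacement character.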
Environment
- Platform (e.g. WebGPU/Vulkan/IOS/Android/CUDA): Metal
- Operating system (e.g. Ubuntu/Windows/MacOS/...): macOS
- Device (e.g. iPhone 12 Pro, PC+RTX 3090, ...)
- How you installed MLC-LLM (conda, source): source
- How you installed TVM-Unity (pip, source): pip
- Python version (e.g. 3.10): 3.11
- GPU driver version (if applicable):
- CUDA/cuDNN version (if applicable):
- TVM Unity Hash Tag (python -c "import tvm; print('\n'.join(f'{k}: {v}' for k, v in tvm.support.libinfo().items()))", applicable if you compile models):
- Any other relevant information:
Potential fix
mlc-llm/python/mlc_chat/rest.py
Lines 167 to 181 in 7c135b8
prev_txt = ""
async for content in AsyncChatCompletionStream():
    if content:
        chunk = ChatCompletionStreamResponse(
            choices=[
                ChatCompletionResponseStreamChoice(
                    index=0,
                    delta=DeltaMessage(
                        role="assistant", content=content[len(prev_txt) :]
                    ),
                    finish_reason="stop",
                )
            ]
        )
        prev_txt = content
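Proposed change: strip U+FFFD replacement characters from the content before computing the delta.
prev_txt = ""
async for content in AsyncChatCompletionStream():
    if content:
        # Drop U+FFFD (replacement character) left by incomplete UTF-8 sequences.
        valid_content = content.replace('�', '')
        chunk = ChatCompletionStreamResponse(
            choices=[
                ChatCompletionResponseStreamChoice(
                    index=0,
                    delta=DeltaMessage(
                        role="assistant", content=valid_content[len(prev_txt):]
                    ),
                    finish_reason="stop",
                )
            ]
        )
        prev_txt = valid_content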
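The delta logic can be exercised without a running server. The sketch below is a variant of the idea, not the code from this issue: `stream_deltas` is a hypothetical helper, and it assumes the stream yields cumulative snapshots of the text (as the `content[len(prev_txt):]` slicing suggests) and that an incomplete character only ever appears as trailing U+FFFD. It strips only the trailing replacement characters, so a U+FFFD that is genuinely part of earlier text is left alone.

```python
REPLACEMENT = "\ufffd"

def stream_deltas(snapshots):
    """Yield newly completed text from cumulative snapshots.

    Trailing U+FFFD characters are held back until a later snapshot
    replaces them with the fully decoded character.
    """
    sent = 0  # number of characters already emitted
    for text in snapshots:
        stable = text.rstrip(REPLACEMENT)
        if len(stable) > sent:
            yield stable[sent:]
            sent = len(stable)

# Simulated snapshots: the emoji is incomplete in the second and third one.
snapshots = [
    "Hello! ",
    "Hello! \ufffd",
    "Hello! \ufffd\ufffd",
    "Hello! 😊",
    "Hello! 😊 I",
]
print(list(stream_deltas(snapshots)))
```

One trade-off of this variant: if the final snapshot really ends in U+FFFD, that character is never emitted.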
YuchenJin