Unicode/UTF-8 Character Handling Issue in REST API "/v1/chat/completions" Endpoint

## 🐛 Bug

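When streaming from the REST API `/v1/chat/completions` endpoint, multi-byte UTF-8 characters such as emoji and CJK text are emitted as `�` (U+FFFD) or as empty `content` instead of the intended characters.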

## To Reproduce

Steps to reproduce the behavior:

1. Start a REST API server with a model that can output CJK characters or emoji.
2. Send a streaming chat completion request:
```bash
curl --data '{"model":"","messages":[{"role":"user","content":"Introduce yourself with lots of emojis"}],"stream":true}' --header 'Content-Type: application/json' http://127.0.0.1:8000/v1/chat/completions
```
3. Observe the streamed output. The emoji arrive as `�` or as empty `content`:
```
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"Hello!"},"finish_reason":"stop"}]}

data: {"choices":[{"index":0,"delta":{"role":"assistant","content":" "},"finish_reason":"stop"}]}

data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"�"},"finish_reason":"stop"}]}

data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"�"},"finish_reason":"stop"}]}

data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"�"},"finish_reason":"stop"}]}

data: {"choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":"stop"}]}

data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"�"},"finish_reason":"stop"}]}

data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"�"},"finish_reason":"stop"}]}

data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"�"},"finish_reason":"stop"}]}

data: {"choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":"stop"}]}

data: {"choices":[{"index":0,"delta":{"role":"assistant","content":" I"},"finish_reason":"stop"}]}

data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"'"},"finish_reason":"stop"}]}

data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"m"},"finish_reason":"stop"}]}

data: {"choices":[{"index":0,"delta":{"role":"assistant","content":" just"},"finish_reason":"stop"}]}

data: {"choices":[{"index":0,"delta":{"role":"assistant","content":" an"},"finish_reason":"stop"}]}

data: {"choices":[{"index":0,"delta":{"role":"assistant","content":" A"},"finish_reason":"stop"}]}

data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"I"},"finish_reason":"stop"}]}

data: {"choices":[{"index":0,"delta":{"role":"assistant","content":" assistant"},"finish_reason":"stop"}]}

...
```
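One plausible explanation, assuming the server decodes the bytes of each token before the full multi-byte sequence is available: a four-byte emoji split across several tokens cannot be decoded piece by piece. The sketch below uses only the Python standard library and is not taken from the MLC-LLM code. It compares decoding each byte in isolation with an incremental decoder that buffers incomplete sequences.

```python
import codecs

# "😊" (U+1F60A) is four bytes in UTF-8: f0 9f 98 8a.
emoji_bytes = "😊".encode("utf-8")

# Decoding each byte on its own cannot succeed, so every byte
# is replaced with U+FFFD.
per_byte = [bytes([b]).decode("utf-8", errors="replace") for b in emoji_bytes]

# An incremental decoder keeps incomplete sequences in its buffer and
# returns text only once a whole character has arrived.
decoder = codecs.getincrementaldecoder("utf-8")()
incremental = [decoder.decode(bytes([b])) for b in emoji_bytes]

print(per_byte)     # four replacement characters
print(incremental)  # three empty strings, then the emoji
```

Whether the server's tokenizer behaves exactly like this is an assumption; the point is only that partial UTF-8 sequences have no valid text representation until they are complete.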

## Expected behavior

Expected output, with each emoji arriving intact in a single delta:
```
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"Hello!"},"finish_reason":"stop"}]}

data: {"choices":[{"index":0,"delta":{"role":"assistant","content":" "},"finish_reason":"stop"}]}

data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"😊"},"finish_reason":"stop"}]}

data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"👋"},"finish_reason":"stop"}]}

data: {"choices":[{"index":0,"delta":{"role":"assistant","content":" I"},"finish_reason":"stop"}]}

data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"'"},"finish_reason":"stop"}]}

data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"m"},"finish_reason":"stop"}]}

data: {"choices":[{"index":0,"delta":{"role":"assistant","content":" just"},"finish_reason":"stop"}]}

data: {"choices":[{"index":0,"delta":{"role":"assistant","content":" an"},"finish_reason":"stop"}]}

data: {"choices":[{"index":0,"delta":{"role":"assistant","content":" A"},"finish_reason":"stop"}]}

data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"I"},"finish_reason":"stop"}]}

data: {"choices":[{"index":0,"delta":{"role":"assistant","content":" assistant"},"finish_reason":"stop"}]}

...
```
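To check a captured stream against this expectation, the deltas can be joined and scanned for U+FFFD. This is a minimal sketch, assuming the stream consists of `data: {...}` lines shaped like the ones above; `collect_content` is a hypothetical helper, not part of MLC-LLM.

```python
import json

def collect_content(sse_text: str) -> str:
    """Join the delta contents of every `data: {...}` line in a captured stream."""
    parts = []
    for line in sse_text.splitlines():
        if not line.startswith("data: "):
            continue  # skip blank separators and anything that is not an event
        try:
            event = json.loads(line[len("data: "):])
        except json.JSONDecodeError:
            continue  # skip payloads that are not JSON objects
        for choice in event.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)

sample = "\n\n".join([
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"Hello!"},"finish_reason":"stop"}]}',
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":" "},"finish_reason":"stop"}]}',
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":"😊"},"finish_reason":"stop"}]}',
])

text = collect_content(sample)
has_replacement = "\ufffd" in text
print(text, has_replacement)
```

A stream that reproduces the bug would make `has_replacement` true or would be missing the emoji from `text` altogether.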

## Environment

 - Platform (e.g. WebGPU/Vulkan/IOS/Android/CUDA): Metal
 - Operating system (e.g. Ubuntu/Windows/MacOS/...): macOS
 - Device (e.g. iPhone 12 Pro, PC+RTX 3090, ...)
 - How you installed MLC-LLM (`conda`, source): source 
 - How you installed TVM-Unity (`pip`, source): pip
 - Python version (e.g. 3.10): 3.11
 - GPU driver version (if applicable):
 - CUDA/cuDNN version (if applicable):
 - TVM Unity Hash Tag (`python -c "import tvm; print('\n'.join(f'{k}: {v}' for k, v in tvm.support.libinfo().items()))"`, applicable if you compile models):
 - Any other relevant information:

## Potential fix

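The streaming loop in `rest.py` derives each delta by slicing the cumulative response by the length of the previous one:

https://github.com/mlc-ai/mlc-llm/blob/7c135b83b1c96acdfcb313a76d70ec6987f8dd87/python/mlc_chat/rest.py#L167-L181

Stripping `�` (U+FFFD) from the content before slicing keeps partially decoded characters out of the deltas:
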
```python3
prev_txt = ""
async for content in AsyncChatCompletionStream():
    if content:
        valid_content = content.replace('�', '')
        chunk = ChatCompletionStreamResponse(
            choices=[
                ChatCompletionResponseStreamChoice(
                    index=0,
                    delta=DeltaMessage(
                        role="assistant", content=valid_content[len(prev_txt):]
                    ),
                    finish_reason="stop",
                )
            ]
        )
        prev_txt = valid_content
```
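The effect of this change can be simulated without a server. The sketch below assumes the stream yields the cumulative response text on each step, as the `content[len(prev_txt):]` slicing implies, and that partially decoded bytes show up as one U+FFFD each until the character completes. The `snapshots` list and both helper functions are hypothetical and exist only for illustration.

```python
REPLACEMENT = "\ufffd"

def deltas_current(snapshots):
    """Slice each cumulative snapshot by the length of the previous one."""
    prev = ""
    for content in snapshots:
        if content:
            yield content[len(prev):]
            prev = content

def deltas_proposed(snapshots):
    """Same slicing, but with U+FFFD stripped from each snapshot first."""
    prev = ""
    for content in snapshots:
        if content:
            valid = content.replace(REPLACEMENT, "")
            yield valid[len(prev):]
            prev = valid

# Hypothetical cumulative snapshots: the emoji completes on the last step.
snapshots = [
    "Hello!",
    "Hello! ",
    "Hello! \ufffd",
    "Hello! \ufffd\ufffd",
    "Hello! \ufffd\ufffd\ufffd",
    "Hello! 😊",
]

current = list(deltas_current(snapshots))
proposed = list(deltas_proposed(snapshots))
print(current)
print(proposed)
```

Under these assumptions, the current slicing sends three `�` deltas and then an empty one, because the completed text is shorter than the previous snapshot, which matches the pattern in the reported output. The proposed variant sends the emoji once it is complete, although it still produces empty deltas in between, and it would also drop any U+FFFD the model emits intentionally.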


Unicode/UTF-8 Character Handling Issue in REST API "/v1/chat/completions" Endpoint #804
For reference, the code currently at the linked lines of `rest.py`:

```python3
prev_txt = ""
async for content in AsyncChatCompletionStream():
    if content:
        chunk = ChatCompletionStreamResponse(
            choices=[
                ChatCompletionResponseStreamChoice(
                    index=0,
                    delta=DeltaMessage(
                        role="assistant", content=content[len(prev_txt) :]
                    ),
                    finish_reason="stop",
                )
            ]
        )
        prev_txt = content
```