server: Unable to Utilize Models Outside of 'ChatML' with OpenAI Library #5921

Closed · infozzdatalabs opened this issue on Mar 7, 2024 · 4 comments

@infozzdatalabs

I'm unsure whether this is a limitation of the OpenAI library or a result of poor handling on the server side. However, after extensively testing various models with the latest server Docker image with CUDA, I've come to a conclusion: it seems impossible to run a model that uses a chat template other than ChatML together with the OpenAI library. All attempts resulted in broken responses. This includes the model at https://huggingface.co/mlabonne/AlphaMonarch-7B-GGUF, which I requested some time ago. I apologize if this isn't considered a bug, but I'm at a loss for what to do next. Thank you in advance.
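
For reference, a minimal sketch of the setup being described, assuming the llama.cpp server's default OpenAI-compatible endpoint at http://localhost:8080/v1 and the openai Python client (v1.x); the model name is a placeholder, since the local server serves whatever model it was launched with. With /chat/completions, the chat template is applied server-side, which is why template detection matters here:

```python
# Minimal sketch: point the OpenAI Python client at a local llama.cpp server.
# The host, port, and model name below are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # llama.cpp server's OpenAI-compatible API
    api_key="sk-no-key-required",         # the local server does not validate the key
)

response = client.chat.completions.create(
    model="alphamonarch-7b",  # placeholder; the server uses whichever model it loaded
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello, who are you?"},
    ],
)
print(response.choices[0].message.content)
```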

@ngxson
Collaborator

ngxson commented Mar 7, 2024

FYI, we already support a number of templates, including the AlphaMonarch one you mentioned: https://github.com/ggerganov/llama.cpp/wiki/Templates-supported-by-llama_chat_apply_template

You can run the server with the --verbose option; it will show the formatted chat in the log.

If the server still uses ChatML for AlphaMonarch, the GGUF model file may not have a chat template embedded in it, or you may be using an old version of the server.
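
One way to check whether the GGUF file carries a chat template at all is to look for the tokenizer.chat_template metadata key, for example with the gguf Python package that ships with llama.cpp. This is a rough sketch; the low-level field access shown here is an assumption and may differ between gguf-py versions:

```python
# Rough sketch: check whether a GGUF file embeds a chat template, using the
# gguf Python package (gguf-py from the llama.cpp repo). The field-access
# details below are an assumption and may vary between package versions.
import sys
from gguf import GGUFReader

reader = GGUFReader(sys.argv[1])  # path to the .gguf file
field = reader.fields.get("tokenizer.chat_template")

if field is None:
    print("No chat template embedded in this GGUF; the server falls back to ChatML.")
else:
    # For string-valued metadata, the last data part holds the raw UTF-8 bytes.
    template = bytes(field.parts[field.data[-1]]).decode("utf-8")
    print("Embedded chat template:")
    print(template)
```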

@infozzdatalabs
Author

That's curious; the chat does seem to be formatted fine. I think my issue must be related to the use of special characters like {} or : in the messages. That raises another question: is there any way to use a chat template that is not "hardcoded" in llama.cpp? Newer models with better performance keep appearing, and I can't use them because of that.

@ngxson
Collaborator

ngxson commented Mar 7, 2024

On the same wiki page, I mentioned that for now you can use the /completions endpoint instead of /chat/completions, since it allows you to send a raw prompt that you format yourself.
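
A minimal sketch of that approach, assuming the server's default raw /completion endpoint on localhost:8080 and Python's requests library; the prompt format used here is purely illustrative, not AlphaMonarch's actual template:

```python
# Minimal sketch: send a self-formatted prompt to the llama.cpp server's raw
# /completion endpoint. The host, port, and prompt format are placeholders.
import requests

def format_prompt(user_message: str) -> str:
    # Illustrative format only; substitute whatever template your model expects.
    return f"### Instruction:\n{user_message}\n\n### Response:\n"

resp = requests.post(
    "http://localhost:8080/completion",
    json={
        "prompt": format_prompt("Hello, who are you?"),
        "n_predict": 128,   # maximum number of tokens to generate
        "temperature": 0.7,
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["content"])
```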

I agree that it's not easy to use, but AFAIK there are currently two widely used chat template formats, and neither is suitable for implementation in llama.cpp:

  • Jinja templates: the parser is too complicated to implement in C++
  • LM Studio format: personally, I don't like the idea of this format, because it misses many details, like where to place the BOS and EOS tokens, etc.
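
For completeness, rendering a Hugging Face-style Jinja chat template on the client side is straightforward in Python and pairs well with the /completion approach above. This is a rough sketch using the jinja2 package; the template string is a simplified ChatML-style example, not any specific model's exact template:

```python
# Rough sketch: render a Jinja chat template client-side, then send the
# resulting string as the "prompt" field of a /completion request.
from jinja2 import Template

# Simplified ChatML-style template for illustration; real models ship their own
# template in tokenizer_config.json (and, when present, in the GGUF metadata).
CHAT_TEMPLATE = (
    "{% for message in messages %}"
    "<|im_start|>{{ message['role'] }}\n{{ message['content'] }}<|im_end|>\n"
    "{% endfor %}"
    "{% if add_generation_prompt %}<|im_start|>assistant\n{% endif %}"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello, who are you?"},
]

prompt = Template(CHAT_TEMPLATE).render(
    messages=messages,
    add_generation_prompt=True,
)
print(prompt)
```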


This issue was closed because it has been inactive for 14 days since being marked as stale.
