server: Unable to Utilize Models Outside of 'ChatML' with OpenAI Library #5921
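I'm unsure whether this is a limitation of the OpenAI library or a problem in the server. However, after extensively testing various models with the latest server image in Docker with CUDA, I've come to a conclusion: it seems impossible to run a model that uses a chat template other than ChatML through the OpenAI library. All attempts resulted in failed responses. This includes the model at https://huggingface.co/mlabonne/AlphaMonarch-7B-GGUF, which I requested some time ago. I apologize if this isn't considered a bug, but I'm at a loss for what to do next. Thank you in advance.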
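For context, a minimal sketch of the kind of setup being described: the official OpenAI Python client pointed at llama.cpp's OpenAI-compatible endpoint. The model path, port, and prompt below are assumptions, not details taken from the issue.

```python
# Minimal sketch: OpenAI Python client talking to a llama.cpp server.
# Assumed server launch (path and port are assumptions):
#   ./server -m AlphaMonarch-7B.Q4_K_M.gguf --port 8080
from openai import OpenAI

# llama.cpp's server does not validate the API key, but the client requires one
client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-no-key-required")

response = client.chat.completions.create(
    model="AlphaMonarch-7B",  # the server ignores this name; any string works
    messages=[{"role": "user", "content": "Hello, who are you?"}],
)
print(response.choices[0].message.content)
```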
FYI, we already support some templates, including the AlphaMonarch one you mentioned: https://github.com/ggerganov/llama.cpp/wiki/Templates-supported-by-llama_chat_apply_template

You can run the server with the `--chat-template` argument. If the server still uses ChatML for Monarch, either the GGUF model file does not have a chat template inside it, or you're using an old version of the server.
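One way to check the first possibility is to read the `tokenizer.chat_template` metadata key with the `gguf` Python package that ships with llama.cpp. A sketch, noting that the model path is an assumption and the exact field-decoding details can vary between `gguf` versions:

```python
# Sketch: check whether a GGUF file carries an embedded chat template.
# Requires `pip install gguf`; the model path is an assumption, and the
# byte-decoding of string fields may differ across gguf package versions.
from gguf import GGUFReader

reader = GGUFReader("AlphaMonarch-7B.Q4_K_M.gguf")
field = reader.fields.get("tokenizer.chat_template")
if field is None:
    # Without an embedded template, the server falls back to ChatML
    print("no chat template embedded in this GGUF file")
else:
    # For string-typed fields, the last part holds the raw UTF-8 bytes
    print(bytes(field.parts[-1]).decode("utf-8"))
```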
That's curious; it seems to be formatted fine. I think my issue must be related to the use of special characters.
In the same wiki page, I mentioned a workaround that you can use for now. I agree that it's not easy to use, but AFAIK there are currently two widely used chat-template engines, and neither of them is suitable to be implemented in llama.cpp.
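For reference, a sketch of that kind of workaround under stated assumptions: format the prompt yourself and POST it to the server's raw `/completion` endpoint instead of the OpenAI-style chat endpoint. The `<|role|>` markers below are made-up placeholders, not a real model's template; substitute the exact template from the model card.

```python
# Sketch of the manual-templating workaround: build the prompt string
# yourself and POST it to llama.cpp server's /completion endpoint.
import requests

def apply_template(messages):
    # Hypothetical template for illustration; use the model's actual one
    prompt = ""
    for m in messages:
        prompt += f"<|{m['role']}|>\n{m['content']}\n"
    return prompt + "<|assistant|>\n"

payload = {
    "prompt": apply_template([{"role": "user", "content": "Hello!"}]),
    "n_predict": 128,
    "stop": ["<|user|>"],  # stop sequence matching the placeholder template
}
r = requests.post("http://localhost:8080/completion", json=payload)
print(r.json()["content"])
```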
This issue was closed because it has been inactive for 14 days since being marked as stale.