
how to set this chat_template in server? #5974


Closed
wac81 opened this issue Mar 10, 2024 · 8 comments
Labels
enhancement New feature or request stale

Comments

@wac81

wac81 commented Mar 10, 2024

How do I set this chat_template for openchat?
I noticed the output differs between ./server and python -m llama_cpp.server, so I suspect a different chat_template may be the cause.

openchat chat_template:
Using gguf chat template: {{ bos_token }}{% for message in messages %}{{ 'GPT4 Correct ' + message['role'].title() + ': ' + message['content'] + '<|end_of_turn|>'}}{% endfor %}{% if add_generation_prompt %}{{ 'GPT4 Correct Assistant:' }}{% endif %}
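
For context, this is an ordinary Jinja template. Rendered outside the server (a quick sketch using the jinja2 package directly, with purely illustrative values), it produces the GPT4 Correct prompt format:

```python
from jinja2 import Template

template = Template(
    "{{ bos_token }}{% for message in messages %}"
    "{{ 'GPT4 Correct ' + message['role'].title() + ': ' + message['content'] + '<|end_of_turn|>' }}"
    "{% endfor %}"
    "{% if add_generation_prompt %}{{ 'GPT4 Correct Assistant:' }}{% endif %}"
)

print(template.render(
    bos_token="<s>",  # illustrative; the real value comes from the tokenizer
    messages=[{"role": "user", "content": "Hello"}],
    add_generation_prompt=True,
))
# -> <s>GPT4 Correct User: Hello<|end_of_turn|>GPT4 Correct Assistant:
```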

How do I set the chat_template in ./server with --chat-template?

@wac81 wac81 added the enhancement New feature or request label Mar 10, 2024
@phymbert phymbert assigned phymbert and ngxson and unassigned phymbert Mar 10, 2024
@phymbert
Collaborator

Have a look at https://github.com/ggerganov/llama.cpp/wiki/Templates-supported-by-llama_chat_apply_template

@ngxson
Collaborator

ngxson commented Mar 10, 2024

We don’t support custom chat templates for now, but you can use the /completions endpoint (not /chat/completions) and send the chat already formatted with your own template.
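
For illustration, a minimal sketch of that workaround using Python's requests, assuming the server is listening on the default port 8080 and that its plain completion endpoint accepts a JSON body with prompt / n_predict / stop fields (adjust the URL and fields to your build):

```python
import requests

# Apply the openchat "GPT4 Correct" template by hand, since the server
# will not render a custom chat template for us on this endpoint.
messages = [{"role": "user", "content": "Hello, who are you?"}]
prompt = "".join(
    "GPT4 Correct " + m["role"].title() + ": " + m["content"] + "<|end_of_turn|>"
    for m in messages
) + "GPT4 Correct Assistant:"

resp = requests.post(
    "http://localhost:8080/completion",  # assumed default host/port and endpoint
    json={"prompt": prompt, "n_predict": 256, "stop": ["<|end_of_turn|>"]},
)
print(resp.json()["content"])
```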

@wac81
Author

wac81 commented Mar 10, 2024

An extension of that problem: ./server and python -m llama_cpp.server give inconsistent results, and ./server is noticeably worse and less controllable. As far as I can tell, the python -m startup path uses the GGUF's chat_template; I don't know if I'm right, but the results are very different. I am using openchat-0106.

@wac81
Author

wac81 commented Mar 10, 2024

> Have a look at https://github.com/ggerganov/llama.cpp/wiki/Templates-supported-by-llama_chat_apply_template

I've definitely seen the link, but thanks anyway

@teleprint-me
Contributor

You can manually set the chat template using gguf-py. Look at set-metadata.py in examples. I would go into more detail, but I have to go to work. I can post later if you're still having trouble understanding how to do it.
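
To see what you'd be changing, here is a minimal read-only sketch with gguf-py's GGUFReader (assuming the field layout in current gguf-py versions; the model filename is a placeholder). The template is stored in the GGUF metadata under the tokenizer.chat_template key, which is what the metadata script would overwrite:

```python
from gguf import GGUFReader  # pip install gguf

KEY = "tokenizer.chat_template"

reader = GGUFReader("openchat-3.5-0106.Q4_K_M.gguf")  # placeholder path
field = reader.fields.get(KEY)
if field is None:
    print(f"{KEY} is not set in this file")
else:
    # String values are stored as raw bytes in one of the memory-mapped parts.
    raw = field.parts[field.data[0]]
    print(bytes(raw).decode("utf-8"))
```

Note that replacing the value generally means rewriting the file (the string length can change), so expect to produce a new GGUF rather than patching the original in place.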

@ngxson
Collaborator

ngxson commented Mar 10, 2024

@teleprint-me He's using a custom, self-made chat template: {{ 'GPT4 Correct ' + message['role'].title()

This is not one of the common chat templates supported by llama.cpp, so it won't work. Please see the discussion: #5922 (comment)

@teleprint-me
Contributor

Yes, I understand how the template works. It doesn't change anything. My point stands.

@github-actions github-actions bot added the stale label Apr 10, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.
