* llama: add llama_chat_apply_template
* test-chat-template: remove redundant vector
* chat_template: do not use std::string for buffer
* add clarification for llama_chat_apply_template
* llama_chat_apply_template: add zephyr template
* llama_chat_apply_template: correct docs
* llama_chat_apply_template: use term "chat" everywhere
* llama_chat_apply_template: change variable name to "tmpl"
/// Apply chat template. Inspired by hf apply_chat_template() on python.
/// Both "model" and "custom_template" are optional, but at least one is required. "custom_template" has higher precedence than "model".
/// NOTE: This function only supports some known jinja templates. It is not a jinja parser.
/// @param tmpl A Jinja template to use for this chat. If this is nullptr, the model's default chat template will be used instead.
/// @param chat Pointer to a list of multiple llama_chat_message
/// @param n_msg Number of llama_chat_message in this chat
/// @param add_ass Whether to end the prompt with the token(s) that indicate the start of an assistant message.
/// @param buf A buffer to hold the output formatted prompt. The recommended alloc size is 2 * (total number of characters of all messages)
/// @param length The size of the allocated buffer
/// @return The total number of bytes of the formatted prompt. If it is larger than the size of the buffer, you may need to re-alloc it and then re-apply the template.
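A minimal usage sketch based on the documentation above, assuming the `llama_chat_message` struct and the `llama_chat_apply_template()` signature introduced by this change (an already-loaded `llama_model * model` is also assumed; the exact signature may differ between revisions). It formats a short chat with the model's default template and re-allocates the buffer if the first call reports a larger required size:

```cpp
#include <cstring>
#include <string>
#include <vector>
#include "llama.h"

// Sketch: format a chat prompt using the model's built-in chat template.
std::string format_chat(const llama_model * model) {
    std::vector<llama_chat_message> chat = {
        { "user",      "Hello!"                   },
        { "assistant", "Hi, how can I help you?"  },
        { "user",      "What is a chat template?" },
    };

    // Recommended alloc size: 2 * (total number of characters of all messages).
    size_t n_chars = 0;
    for (const auto & msg : chat) {
        n_chars += strlen(msg.content);
    }
    std::vector<char> buf(2 * n_chars);

    // tmpl = nullptr  => use the model's default chat template.
    // add_ass = true  => end the prompt with the assistant prefix token(s).
    int32_t res = llama_chat_apply_template(model, nullptr, chat.data(), chat.size(),
                                            true, buf.data(), (int32_t) buf.size());
    if (res > (int32_t) buf.size()) {
        // Output did not fit: re-alloc to the reported size and re-apply the template.
        buf.resize(res);
        res = llama_chat_apply_template(model, nullptr, chat.data(), chat.size(),
                                        true, buf.data(), (int32_t) buf.size());
    }
    if (res < 0) {
        return ""; // template not supported or other error
    }
    return std::string(buf.data(), res);
}
```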