Closed
Labels
feature request, good first issue
Description
🚀 The feature, motivation and pitch
The LLM class currently has no way to apply a model's chat template,
so we often see patterns like this:
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

max_model_len, tp_size = 8192, 1
model_name = "deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct"

# The tokenizer is loaded separately only to reach the model's chat template.
tokenizer = AutoTokenizer.from_pretrained(model_name)
llm = LLM(model=model_name, tensor_parallel_size=tp_size, max_model_len=max_model_len, trust_remote_code=True, enforce_eager=True)
# Illustrative sampling settings (assumed values); adjust as needed.
sampling_params = SamplingParams(temperature=0.3, max_tokens=256)

messages_list = [
    [{"role": "user", "content": "Who are you?"}],
    [{"role": "user", "content": "write a quick sort algorithm in python."}],
    [{"role": "user", "content": "Write a piece of quicksort code in C++."}],
]

# Render each conversation to token IDs with the chat template, then generate.
prompt_token_ids = [tokenizer.apply_chat_template(messages, add_generation_prompt=True) for messages in messages_list]
outputs = llm.generate(prompt_token_ids=prompt_token_ids, sampling_params=sampling_params)

generated_text = [output.outputs[0].text for output in outputs]
print(generated_text)
Pass a list of messages and apply the chat template internally
from vllm import LLM
model = LLM("...")
messages_list = [
    [{"role": "user", "content": "Who are you?"}],
    [{"role": "user", "content": "write a quick sort algorithm in python."}],
    [{"role": "user", "content": "Write a piece of quicksort code in C++."}],
]
# chat template applied internally
outputs = model.generate(messages_list)
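One way this could work internally is to recognize chat-style input and render it to token IDs before generation. The following is only a sketch under assumptions: `is_chat_input` and `to_prompt_token_ids` are hypothetical helper names, not part of vLLM, and `tokenizer` is assumed to be any object exposing a Hugging Face style `apply_chat_template` method.

```python
from typing import Any, List, Sequence


def is_chat_input(item: Any) -> bool:
    """Return True if `item` looks like one conversation:
    a non-empty list of dicts that each carry "role" and "content"."""
    return (
        isinstance(item, list)
        and len(item) > 0
        and all(isinstance(m, dict) and "role" in m and "content" in m for m in item)
    )


def to_prompt_token_ids(tokenizer: Any, inputs: Sequence[Any]) -> List[List[int]]:
    """Hypothetical pre-processing step: render each conversation to token IDs.

    Assumes `tokenizer.apply_chat_template` tokenizes by default, as the
    Hugging Face tokenizers do.
    """
    token_ids = []
    for item in inputs:
        if not is_chat_input(item):
            raise TypeError("expected a list of {'role', 'content'} messages")
        token_ids.append(
            tokenizer.apply_chat_template(item, add_generation_prompt=True)
        )
    return token_ids
```

Checking the input shape up front keeps plain string prompts from being silently treated as conversations; how a real implementation would tell the two apart is a design decision for the maintainers.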
Apply the chat template through the LLM class
from vllm import LLM
model = LLM("...")
messages_list = [
    [{"role": "user", "content": "Who are you?"}],
    [{"role": "user", "content": "write a quick sort algorithm in python."}],
    [{"role": "user", "content": "Write a piece of quicksort code in C++."}],
]
# use LLM class to apply chat template to prompts
prompt_ids = model.apply_chat_template(messages_list, add_generation_prompt=True)
text = model.apply_chat_template(messages_list, add_generation_prompt=True, tokenize=False)
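Until something like this exists, the same behavior can be approximated with a small wrapper around the tokenizer. This is a minimal sketch, not the proposed implementation: `apply_chat_template_batch` is a hypothetical name, and `tokenizer` is assumed to be a Hugging Face tokenizer (or any object with a compatible `apply_chat_template` accepting `add_generation_prompt` and `tokenize`).

```python
from typing import Any, Dict, List, Union

Conversation = List[Dict[str, str]]


def apply_chat_template_batch(
    tokenizer: Any,
    conversations: List[Conversation],
    add_generation_prompt: bool = True,
    tokenize: bool = True,
) -> Union[List[List[int]], List[str]]:
    """Apply the tokenizer's chat template to every conversation.

    Returns token ID lists when tokenize=True, rendered prompt strings otherwise.
    """
    return [
        tokenizer.apply_chat_template(
            conversation,
            add_generation_prompt=add_generation_prompt,
            tokenize=tokenize,
        )
        for conversation in conversations
    ]
```

With the `tokenizer` and `messages_list` from the first snippet, `apply_chat_template_batch(tokenizer, messages_list)` would yield the token IDs, and passing `tokenize=False` would yield the rendered prompt text.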
Alternatives
No response
Additional context
No response