
[server] phi-3 uses <|endoftext|> instead of <|end|> when applying chat template in /chat/completions #7432

@andysalerno

When serving phi-3 without the option --chat-template phi3, the chat template is applied incorrectly and the resulting prompt contains the wrong end-of-turn token.

For example, when I do pass --chat-template phi3, here is the log output when I send the message "hi":

{
    "level": "VERB",
    "function": "update_slots",
    "line": 1954,
    "msg": "prompt tokenized",
    "id_slot": 0,
    "id_task": 1,
    "n_ctx": 8192,
    "n_keep": 0,
    "n_prompt_tokens": 7,
    "prompt_tokens": "<s><|system|><|end|><|user|> hi<|end|><|assistant|>"
}

Actually, the extra space after <|user|> is concerning: it should be a newline. But maybe that's just an artifact of how the log message is formatted.

But here's what happens when --chat-template phi3 is omitted:

{
    "level": "VERB",
    "function": "update_slots",
    "line": 1954,
    "msg": "prompt tokenized",
    "id_slot": 0,
    "id_task": 0,
    "n_ctx": 8192,
    "n_keep": 0,
    "n_prompt_tokens": 11,
    "prompt_tokens": "<s><|system|><|endoftext|> \n<|user|> hi<|endoftext|> \n<|assistant|>"
}

Note that it uses <|endoftext|> (wrong) instead of <|end|> (correct), which causes really bad generation.
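To make the difference concrete, here is a minimal Python sketch of the prompt layout I would expect, plus a check for the wrong terminator. It is not llama.cpp code: `render_phi3` and `uses_wrong_terminator` are hypothetical helper names, and placing a newline after each role tag and after each `<|end|>` is an assumption based on the note above that the space after <|user|> should be a newline. The BOS token `<s>` is left out because the tokenizer adds it.

```python
END = "<|end|>"              # terminator the phi3 template should emit
WRONG_END = "<|endoftext|>"  # terminator seen when --chat-template phi3 is omitted


def render_phi3(messages):
    """Render chat messages in the assumed phi3 layout, ending with the assistant tag."""
    parts = [f"<|{m['role']}|>\n{m['content']}{END}\n" for m in messages]
    return "".join(parts) + "<|assistant|>\n"


def uses_wrong_terminator(prompt):
    """Return True if the prompt closes turns with <|endoftext|>."""
    return WRONG_END in prompt


expected = render_phi3([
    {"role": "system", "content": ""},
    {"role": "user", "content": "hi"},
])

# Prompt string copied from the second log excerpt above.
logged = "<s><|system|><|endoftext|> \n<|user|> hi<|endoftext|> \n<|assistant|>"

print(uses_wrong_terminator(expected))
print(uses_wrong_terminator(logged))
```

Running the same check against the prompt_tokens value in the server's verbose log is a quick way to tell which template was actually applied.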

I am using the GGUF straight from Microsoft, so I guess it is as official as it gets:

https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-gguf

Possibly the problem is in the GGUF itself? Even so, it's weird that using the "official" GGUF results in an incorrect prompt once the template is applied.

Now, you could just always pass --chat-template phi3. But my expectation is that the detection heuristic should pick up the phi3 chat template automatically when using the canonical/official Phi-3 models, since they purport to support phi3.
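As an illustration of how a detection heuristic could end up on the wrong format, here is a small Python sketch of substring-based template detection. This is a hypothetical example, not the server's actual implementation: the function name `detect_template`, the label "endoftext-style", and the specific substring checks are all assumptions made for the sake of the example.

```python
def detect_template(template_text):
    """Guess a chat format from substrings of the template stored with the model.

    Hypothetical heuristic: specific checks must come before generic ones,
    otherwise a phi3 template falls through to a look-alike format.
    """
    # Specific: phi3 closes turns with <|end|> and uses an <|assistant|> tag.
    if "<|assistant|>" in template_text and "<|end|>" in template_text:
        return "phi3"
    # Generic: other formats also use <|user|> tags but close with <|endoftext|>.
    if "<|user|>" in template_text:
        return "endoftext-style"
    return "unknown"


# Made-up template snippets, only for exercising the sketch.
phi3_like = "{{'<|user|>' + content + '<|end|>' + '<|assistant|>'}}"
lookalike = "{{'<|user|>' + content + eos_token}}"

print(detect_template(phi3_like))
print(detect_template(lookalike))
```

If the template stored with the model never contains the literal `<|end|>` string (for example, because it refers to an end-of-sequence variable instead), a heuristic of this shape would fall through to the generic branch, which would be consistent with the <|endoftext|> output in the log above.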
