Closed
Hello!
Using this GGUF: https://huggingface.co/LoneStriker/opus-v1.2-llama-3-8b-GGUF
When the output contains any of the special tokens, like <|im_start|> or <|im_end|>, they are rendered as an empty string. This breaks custom stopping string functionality (e.g. adding "<|im_end|>" to stop strings does not work, as it relies on string comparison).
The tokens are tokenized correctly, just not rendered:
main: prompt: '<|im_end|>'
main: number of tokens in prompt = 1
128009 -> ''
main: prompt: '<|im_start|>'
main: number of tokens in prompt = 1
128006 -> ''
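To illustrate why an empty rendering defeats string-based stop strings, here is a minimal sketch in Python. It is not llama.cpp code: the function `generate_until_stop`, the two lookup tables, and the token IDs 1000 and 1001 are hypothetical and made up for the example. Only the IDs 128006 and 128009 and their empty rendering come from the log above.

```python
# Hypothetical token-to-text tables. Only 128006 and 128009 (and the fact
# that they render as '') come from the log above; 1000 and 1001 are made up.
RENDERED_EMPTY = {128006: "", 128009: "", 1000: "Hello", 1001: " world"}
RENDERED_TEXT = {
    128006: "<|im_start|>",
    128009: "<|im_end|>",
    1000: "Hello",
    1001: " world",
}


def generate_until_stop(token_ids, pieces, stop_strings):
    """Append each token's text; stop as soon as a stop string appears.

    Returns (text_before_stop, stopped). This mimics a front end that
    detects stop strings by plain string comparison on the rendered output.
    """
    text = ""
    for token_id in token_ids:
        text += pieces[token_id]
        for stop in stop_strings:
            index = text.find(stop)
            if index != -1:
                return text[:index], True
    return text, False


# The model emits "Hello", then <|im_end|>, then keeps going.
stream = [1000, 128009, 1001]
stops = ["<|im_end|>"]

# Special token rendered as '': the stop string never shows up in the text.
print(generate_until_stop(stream, RENDERED_EMPTY, stops))

# Special token rendered as its literal text: the stop string matches.
print(generate_until_stop(stream, RENDERED_TEXT, stops))
```

With the empty rendering, the comparison never sees "<|im_end|>" in the accumulated text, so generation runs past the point where it should have stopped.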
I first tested this with an old commit:
version: 2243 (201294ae)
201294ae177b308fb3a99dc504dd6d27e8afa907
I then reproduced it with a fresh main:
version: 2698 (637e9a86)
637e9a86c220718d008b54842dfd294aa96d3b7a