
Eval bug: TikTokenTokenizer has no attribute vocab #12044


Closed
zhanghui-china opened this issue Feb 24, 2025 · 6 comments

Comments


zhanghui-china commented Feb 24, 2025

Name and Version

./llama-cli --version
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA GeForce RTX 3080 Laptop GPU, compute capability 8.6, VMM: yes
version: 1 (3edfa7d)
built with cc (Ubuntu 10.5.0-1ubuntu1~22.04) 10.5.0 for x86_64-linux-gnu

Operating systems

Linux

GGML backends

CUDA

Hardware

RTX 3080 Laptop

Models

Moonlight-16B-A3B-Instruct

Problem description & steps to reproduce

When I run

python convert_hf_to_gguf.py ./Moonlight-16B-A3B-Instruct --outfile Moonlight-16B-A3B-Instruct.gguf --outtype f16

it fails with the error shown in the relevant log output below.

First Bad Commit


Relevant log output

INFO:hf-to-gguf:blk.8.attn_norm.weight,       torch.bfloat16 --> F32, shape = {2048}
INFO:hf-to-gguf:blk.8.ffn_down_exps.weight,   torch.bfloat16 --> F16, shape = {1408, 2048, 64}
INFO:hf-to-gguf:blk.8.ffn_gate_exps.weight,   torch.bfloat16 --> F16, shape = {2048, 1408, 64}
INFO:hf-to-gguf:blk.8.ffn_up_exps.weight,     torch.bfloat16 --> F16, shape = {2048, 1408, 64}
INFO:hf-to-gguf:blk.8.exp_probs_b.bias,       torch.bfloat16 --> F32, shape = {64}
INFO:hf-to-gguf:blk.8.ffn_gate_inp.weight,    torch.bfloat16 --> F32, shape = {2048, 64}
INFO:hf-to-gguf:blk.8.ffn_down_shexp.weight,  torch.bfloat16 --> F16, shape = {2816, 2048}
INFO:hf-to-gguf:blk.8.ffn_gate_shexp.weight,  torch.bfloat16 --> F16, shape = {2048, 2816}
INFO:hf-to-gguf:blk.8.ffn_up_shexp.weight,    torch.bfloat16 --> F16, shape = {2048, 2816}
INFO:hf-to-gguf:blk.8.ffn_norm.weight,        torch.bfloat16 --> F32, shape = {2048}
INFO:hf-to-gguf:blk.8.attn_kv_a_norm.weight,  torch.bfloat16 --> F32, shape = {512}
INFO:hf-to-gguf:blk.8.attn_kv_a_mqa.weight,   torch.bfloat16 --> F16, shape = {2048, 576}
INFO:hf-to-gguf:blk.8.attn_kv_b.weight,       torch.bfloat16 --> F16, shape = {512, 4096}
INFO:hf-to-gguf:blk.8.attn_output.weight,     torch.bfloat16 --> F16, shape = {2048, 2048}
INFO:hf-to-gguf:blk.8.attn_q.weight,          torch.bfloat16 --> F16, shape = {2048, 3072}
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:gguf: context length = 8192
INFO:hf-to-gguf:gguf: embedding length = 2048
INFO:hf-to-gguf:gguf: feed forward length = 11264
INFO:hf-to-gguf:gguf: head count = 16
INFO:hf-to-gguf:gguf: key-value head count = 16
INFO:hf-to-gguf:gguf: rope theta = 50000.0
INFO:hf-to-gguf:gguf: rms norm epsilon = 1e-05
INFO:hf-to-gguf:gguf: experts used count = 6
INFO:hf-to-gguf:gguf: file type = 1
INFO:hf-to-gguf:Set model tokenizer
INFO:transformers_modules.Moonlight-16B-A3B-Instruct.tokenization_moonshot:Reloaded tiktoken model from Moonlight-16B-A3B-Instruct/tiktoken.model
INFO:transformers_modules.Moonlight-16B-A3B-Instruct.tokenization_moonshot:#words: 163842 - BOS ID: 163584 - EOS ID: 163585
Traceback (most recent call last):
  File "/home/zhanghui/llama.cpp/convert_hf_to_gguf.py", line 5139, in <module>
    main()
  File "/home/zhanghui/llama.cpp/convert_hf_to_gguf.py", line 5133, in main
    model_instance.write()
  File "/home/zhanghui/llama.cpp/convert_hf_to_gguf.py", line 440, in write
    self.prepare_metadata(vocab_only=False)
  File "/home/zhanghui/llama.cpp/convert_hf_to_gguf.py", line 433, in prepare_metadata
    self.set_vocab()
  File "/home/zhanghui/llama.cpp/convert_hf_to_gguf.py", line 4057, in set_vocab
    self._set_vocab_gpt2()
  File "/home/zhanghui/llama.cpp/convert_hf_to_gguf.py", line 726, in _set_vocab_gpt2
    tokens, toktypes, tokpre = self.get_vocab_base()
  File "/home/zhanghui/llama.cpp/convert_hf_to_gguf.py", line 524, in get_vocab_base
    vocab_size = self.hparams.get("vocab_size", len(tokenizer.vocab))
  File "/home/zhanghui/anaconda3/envs/kimi/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1108, in __getattr__
    raise AttributeError(f"{self.__class__.__name__} has no attribute {key}")
AttributeError: TikTokenTokenizer has no attribute vocab
@grapevine-AI

You should use tokenizer.vocab_size instead of len(tokenizer.vocab)
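For context, a minimal sketch of what that change would look like in get_vocab_base() of convert_hf_to_gguf.py (the line number comes from the traceback above; the surrounding code is abbreviated, so this is a suggestion, not a full patch):

```python
# convert_hf_to_gguf.py, get_vocab_base() -- sketch of the suggested one-line change.
# The TikTokenTokenizer shipped with Moonlight exposes no .vocab attribute,
# so the fallback below raises AttributeError.

# before (line 524 in the traceback):
vocab_size = self.hparams.get("vocab_size", len(tokenizer.vocab))

# after: vocab_size is part of the standard transformers tokenizer interface
vocab_size = self.hparams.get("vocab_size", tokenizer.vocab_size)
```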

@zhanghui-china
Author

You should use tokenizer.vocab_size instead of len(tokenizer.vocab)

Thanks.

What should tokenizer.vocab.values() be changed to?

@grapevine-AI

You could change it to tokenizer.get_vocab().values().

Or, if the error is on the line assert max(tokenizer.vocab.values()) < vocab_size, you can just delete that line. The assert is not necessary.
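A sketch of the two options, applied to the assert that follows the vocab_size line in get_vocab_base() (again a suggestion, not an official patch):

```python
# option 1: read the vocabulary through the public get_vocab() accessor,
# which tokenizers are expected to implement, instead of the missing .vocab attribute
assert max(tokenizer.get_vocab().values()) < vocab_size

# option 2: delete the original assert line entirely;
# it is only a sanity check and is not required for the conversion itself
```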

@zhanghui-china
Author

You could change it to tokenizer.get_vocab().values().

Or, if the error is on the line assert max(tokenizer.vocab.values()) < vocab_size, you can just delete that line. The assert is not necessary.

Thanks a lot.
But when I deleted the assert, another error occurred:
INFO:hf-to-gguf:Set model tokenizer
INFO:transformers_modules.95583251e616c46a80715897a705cd38659afc27.tokenization_moonshot:Reloaded tiktoken model from 95583251e616c46a80715897a705cd38659afc27/tiktoken.model
INFO:transformers_modules.95583251e616c46a80715897a705cd38659afc27.tokenization_moonshot:#words: 163842 - BOS ID: 163584 - EOS ID: 163585
WARNING:hf-to-gguf:

WARNING:hf-to-gguf:**************************************************************************************
WARNING:hf-to-gguf:** WARNING: The BPE pre-tokenizer was not recognized!
WARNING:hf-to-gguf:** There are 2 possible reasons for this:
WARNING:hf-to-gguf:** - the model has not been added to convert_hf_to_gguf_update.py yet
WARNING:hf-to-gguf:** - the pre-tokenization config has changed upstream
WARNING:hf-to-gguf:** Check your model files and convert_hf_to_gguf_update.py and update them accordingly.
WARNING:hf-to-gguf:** ref: #6920
WARNING:hf-to-gguf:**
WARNING:hf-to-gguf:** chkhsh: 81212dc7cdb7e0c1074ca62c5aeab0d43c9f52b8a737be7b12a777c953027890
WARNING:hf-to-gguf:**************************************************************************************
WARNING:hf-to-gguf:

Traceback (most recent call last):
  File "/home/zhanghui/llama.cpp/convert_hf_to_gguf.py", line 5141, in <module>
    main()
  File "/home/zhanghui/llama.cpp/convert_hf_to_gguf.py", line 5135, in main
    model_instance.write()
  File "/home/zhanghui/llama.cpp/convert_hf_to_gguf.py", line 440, in write
    self.prepare_metadata(vocab_only=False)
  File "/home/zhanghui/llama.cpp/convert_hf_to_gguf.py", line 433, in prepare_metadata
    self.set_vocab()
  File "/home/zhanghui/llama.cpp/convert_hf_to_gguf.py", line 4059, in set_vocab
    self._set_vocab_gpt2()
  File "/home/zhanghui/llama.cpp/convert_hf_to_gguf.py", line 728, in _set_vocab_gpt2
    tokens, toktypes, tokpre = self.get_vocab_base()
  File "/home/zhanghui/llama.cpp/convert_hf_to_gguf.py", line 529, in get_vocab_base
    tokpre = self.get_vocab_base_pre(tokenizer)
  File "/home/zhanghui/llama.cpp/convert_hf_to_gguf.py", line 716, in get_vocab_base_pre
    raise NotImplementedError("BPE pre-tokenizer was not recognized - update get_vocab_base_pre()")
NotImplementedError: BPE pre-tokenizer was not recognized - update get_vocab_base_pre()
(llama.cpp) zhanghui@zhanghui:~/.cache/huggingface/hub/models--moonshotai--Moonlight-16B-A3B-Instruct/snapshots$

Maybe I should wait for llama.cpp to support this model.

model address:
https://huggingface.co/moonshotai/Moonlight-16B-A3B-Instruct

@grapevine-AI

Yes, llama.cpp has not implemented Moonlight's pre-tokenizer yet.
But you can substitute another model's pre-tokenizer.
If you want to do that, add this code at line 702 of convert_hf_to_gguf.py:

if chkhsh == "81212dc7cdb7e0c1074ca62c5aeab0d43c9f52b8a737be7b12a777c953027890":
    res = "llama-bpe"

@github-actions github-actions bot added the stale label Mar 29, 2025

This issue was closed because it has been inactive for 14 days since being marked as stale.
