Don't crash when prompt cannot be tokenized #2580
With your modification it no longer crashes; it is useful for me! But it seems your useful modification has not been merged yet?

Before:

Now:
Oh, I got something wrong: when I use special Chinese symbols, similar errors still occur. For example, the Chinese symbol “?” is not in the vocabulary, and the new main program will still crash.

Before:

Now:
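For what it's worth, this failure mode matches what libstdc++ prints when an unordered_map::at() lookup misses a key and the exception is never caught. Below is a minimal, self-contained C++ sketch (a toy vocabulary, not llama.cpp's actual tokenizer code), assuming the offending character is the fullwidth question mark U+FF1F:

```cpp
// Minimal reproduction of the reported abort: looking up a missing key
// with unordered_map::at() throws std::out_of_range, and an uncaught
// exception calls std::terminate(), which libstdc++ reports as
//   terminate called after throwing an instance of 'std::out_of_range'
//     what():  _Map_base::at
#include <cstdio>
#include <string>
#include <unordered_map>

int main() {
    // Toy vocabulary; a real model's token map is far larger.
    std::unordered_map<std::string, int> token_to_id = {
        {"hello", 1}, {"world", 2},
    };

    // UTF-8 bytes of U+FF1F (fullwidth question mark), assumed absent
    // from the vocabulary.
    const std::string symbol = "\xEF\xBC\x9F";

    int id = token_to_id.at(symbol); // throws: key not present
    std::printf("token id: %d\n", id);
}
```

Whether the real tokenizer uses exactly this container is an implementation detail; the point is that any uncaught at()-style lookup against an incomplete vocabulary aborts the whole process.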
I'm getting the same error with the deepseek coder instruct 7B and 33B variants when trying to run a grammar file:

./main -f /home/igoforth/.local/src/sd_gpu/src/prompt.txt -m /home/igoforth/.local/src/ai_detect/deepseek-coder-33B-instruct-GGUF/deepseek-coder-33b-instruct.Q4_K_M.gguf --grammar-file /home/igoforth/.local/src/sd_gpu/src/vuln.gbnf -nommq -ngl 64 -c 512

### Response:terminate called after throwing an instance of 'std::out_of_range'
  what():  _Map_base::at
[1] 3675739 IOT instruction (core dumped)

I believe my problem was mentioned here by TB: #3633.

Worth noting that I get this crash when running deepseek on the latest master when I attempt to offload all layers to GPU (no grammar):

./main -f /home/igoforth/.local/src/sd_gpu/src/prompt.txt -m /home/igoforth/.local/src/ai_detect/deepseek-coder-33B-instruct-GGUF/deepseek-coder-33b-instruct.Q4_K_M.gguf -nommq -ngl 65 -c 512

CUDA error 700 at /home/igoforth/.local/src/llama.cpp.cublas/ggml-cuda.cu:7546: an illegal memory access was encountered
current device: 0

But it runs fine when I offload all but one layer?? The commits at #3633 allow me to offload all layers, but the grammar problem remains.
This issue was closed because it has been inactive for 14 days since being marked as stale.
This program verifies that a given gguf model file can tokenize all potentially valid characters. Since llama.cpp currently raises an exception when tokenization is not possible [1], this tool helps verify that valid ASCII and UTF-8 will always be properly tokenized. [1] ggml-org#2580
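A hedged sketch of what such an exhaustive check could look like. This is not the actual verification program; try_tokenize is a hypothetical stand-in for a call into the model's tokenizer (e.g. via llama_tokenize), and the UTF-8 encoder is inlined to keep the example self-contained:

```cpp
// Sketch: walk every Unicode scalar value, encode it as UTF-8, and
// confirm that tokenization does not throw.
#include <cstdint>
#include <cstdio>
#include <stdexcept>
#include <string>

// Encode a single code point as UTF-8 (the caller excludes values
// above 0x10FFFF and the surrogate range).
static std::string utf8_encode(uint32_t cp) {
    std::string s;
    if (cp < 0x80) {
        s += (char)(cp);
    } else if (cp < 0x800) {
        s += (char)(0xC0 | (cp >> 6));
        s += (char)(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        s += (char)(0xE0 | (cp >> 12));
        s += (char)(0x80 | ((cp >> 6) & 0x3F));
        s += (char)(0x80 | (cp & 0x3F));
    } else {
        s += (char)(0xF0 | (cp >> 18));
        s += (char)(0x80 | ((cp >> 12) & 0x3F));
        s += (char)(0x80 | ((cp >> 6) & 0x3F));
        s += (char)(0x80 | (cp & 0x3F));
    }
    return s;
}

// Hypothetical hook: a real harness would call the model's tokenizer
// here and return false (or throw) when the text cannot be tokenized.
static bool try_tokenize(const std::string & /*text*/) { return true; }

int main() {
    int failures = 0;
    for (uint32_t cp = 1; cp <= 0x10FFFF; ++cp) {
        if (cp >= 0xD800 && cp <= 0xDFFF) continue; // UTF-16 surrogates
        try {
            if (!try_tokenize(utf8_encode(cp))) failures++;
        } catch (const std::out_of_range &) {
            std::printf("cannot tokenize U+%04X\n", cp);
            failures++;
        }
    }
    return failures == 0 ? 0 : 1;
}
```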
As observed in #2379 (comment), custom vocabularies might not include tokens to represent all prompts. In the case above, the static instruction-mode prefix/suffix could not be represented, even when not used. In that situation, main is killed by a std::out_of_range exception thrown from the tokenizer. It might make sense to give a better error message in that case and/or clarify the assumptions llama.cpp makes about the vocabulary.

Backtrace
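As one possible direction for the "better error message" suggestion, here is a minimal sketch of a guarded lookup. The names (token_to_id, lookup_token) are illustrative, not llama.cpp internals; the idea is simply to replace a throwing at() with find() and report the offending bytes instead of aborting:

```cpp
// Sketch: fail with a readable message (or a sentinel) instead of an
// uncaught std::out_of_range when a symbol is missing from the vocab.
#include <cstdio>
#include <cstdlib>
#include <string>
#include <unordered_map>

static int lookup_token(const std::unordered_map<std::string, int> & token_to_id,
                        const std::string & symbol) {
    auto it = token_to_id.find(symbol);
    if (it == token_to_id.end()) {
        // Print the raw bytes so the user can see which input failed.
        std::fprintf(stderr, "error: symbol not in vocabulary:");
        for (unsigned char c : symbol) {
            std::fprintf(stderr, " 0x%02X", c);
        }
        std::fprintf(stderr, "\n");
        std::exit(1); // or return an unknown-token id, if the model has one
    }
    return it->second;
}

int main() {
    const std::unordered_map<std::string, int> vocab = {{"a", 1}};
    std::printf("a -> %d\n", lookup_token(vocab, "a"));
    lookup_token(vocab, "\xEF\xBC\x9F"); // U+FF1F: prints an error and exits
}
```

Falling back to a byte-level or unknown token would be friendlier still, but whether that is possible depends on what the vocabulary actually provides.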