Hi <3 llama.cpp
@KerfuffleV2 shows us that models converted without metadata load differently. Loading without metadata:
llama_model_load_internal: BOS token = 1 ' '
llama_model_load_internal: EOS token = 2 ' '
Loading with one converted with external metadata:
llama_model_load_internal: BOS token = 1 '<s>'
llama_model_load_internal: EOS token = 2 '</s>'
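To illustrate the difference (a minimal sketch with hypothetical data, not llama.cpp's actual loader code): the loader looks up each special token's text by its id, and when the conversion skipped the tokenizer vocab there is nothing to look up, so it falls back to an empty string:

```python
# Token id -> text mapping, as it would come from tokenizer.model metadata.
vocab_with_metadata = {1: "<s>", 2: "</s>"}
# A conversion without metadata has no vocab entries at all.
vocab_without_metadata = {}

def token_text(vocab, token_id):
    # Fall back to '' when the id has no associated text.
    return vocab.get(token_id, "")

print(token_text(vocab_with_metadata, 1))     # '<s>'
print(token_text(vocab_without_metadata, 1))  # ''
```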
I converted WizardMath-7B-V1.0 to GGUF and here are a couple of runs:
ex1:
~/l/b/bin (master) [SIGINT]> ./main -m ~/wizardmath-7b-v1.0.ggmlv3.q4_0.gguf --color -c 2048 --keep -1 -n -1 -t 3 -b 7 -i -r "User:" --in-prefix " " --in-suffix "Assistant:" -f ~/storage/shared/PT/M.txt
main: build = 1015 (226255b)
main: seed = 1692706079
llama_model_loader: loaded meta data with 15 key-value pairs and 291 tensors from /data/data/com.termux/files/home/wizardmath-7b-v1.0.ggmlv3.q4_0.gguf (version GGUF V1 (latest))
..
llama_model_load_internal: format = GGUF V1 (latest)
llama_model_load_internal: arch = llama
llama_model_load_internal: vocab type = SPM
llama_model_load_internal: n_vocab = 32001
llama_model_load_internal: n_ctx_train = 2048
llama_model_load_internal: n_ctx = 2048
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_head_kv = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: n_gqa = 1
llama_model_load_internal: f_norm_eps = 5.0e-06
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: freq_base = 10000.0
llama_model_load_internal: freq_scale = 1
llama_model_load_internal: model type = 7B
llama_model_load_internal: model ftype = mostly Q4_0
llama_model_load_internal: model size = 6.74 B
llama_model_load_internal: general.name = wizardmath-7b-v1.0.ggmlv3.q4_0.bin
llama_model_load_internal: BOS token = 1 ''
llama_model_load_internal: EOS token = 2 ''
llama_model_load_internal: LF token = 13 '<0x0A>'
llama_model_load_internal: ggml ctx size = 0.09 MB
llama_model_load_internal: mem required = 3615.73 MB (+ 1024.00 MB per state)
llama_new_context_with_model: kv self size = 1024.00 MB
llama_new_context_with_model: compute buffer total size = 3.49 MB
system_info: n_threads = 3 / 8 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | VSX = 0 |
main: interactive mode on.
Reverse prompt: 'User'
Input prefix: ' '
Input suffix: 'Assistant:'
sampling: repeat_last_n = 64, repeat_penalty = 1.100000, presence_penalty = 0.000000, frequency_penalty = 0.000000, top_k = 40, tfs_z = 1.000000, top_p = 0.950000, typical_p = 1.000000, temp = 0.800000, mirostat = 0, mirostat_lr = 0.100000, mirostat_ent = 5.000000
generate: n_ctx = 2048, n_batch = 7, n_predict = -1, n_keep = 42
== Running in interactive mode. ==
- Press Ctrl+C to interject at any time.
- Press Return to return control to LLaMa.
- To return control without starting a new line, end your input with '/'.
- If you want to submit another line, end your input with '\'.
Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
Please list 3 movie titles.
### Response: We are asked to list 3 movie titles, which means we need to come up with at least 3 different movie titles. Here is a list of 3 movie titles for your reference:
1. Titanic (1997)
2. The Matrix (1999)
3. Toy Story (1995)
These are just some examples, and there are certainly many more movie titles out there. However, these 3 movies have been well-known and popular for a long time, and they represent different genres and styles of filmmaking. Therefore, I believe that these 3 movie titles will not disappoint you.
The answer is: Here are three movie titles: Titanic (1997), The Matrix (1999), and Toy Story (1995).
</s>
The answer is: Three movie titles are: Titanic (1997), The Matrix (1999), and Toy Story (1995)..
</s>
ex2:
### Instruction:
Please list 3 movie titles.
### Response:I'm not sure what you're looking for, but here are some movie titles:
1. The Shawshank Redemption
2. Schindler's List
3. The Godfather
The answer is: Here are three movie titles:
1. The Shawshank Redemption
2. Schindler's List
3. The Godfather.
</s>
The answer is: Here are three movie titles:
1. The Shawshank Redemption
2. Schindler's List
3. The Godfather.
</s>
It appears that, due to the way the model was converted, it's unable to utilise the stop sequence, and so never returns control to the user in this case.
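For context, the stopping logic works roughly like this (a hypothetical sketch, not llama.cpp's actual implementation): generation ends when the model emits the EOS token id, or when the decoded output ends with the reverse prompt. With broken metadata the model tends to print the literal text `</s>` instead of emitting the real EOS id, so neither condition ever fires:

```python
def should_stop(last_token_id, eos_id, tail_text, reverse_prompt):
    # Stop when the model emits the EOS token id...
    if last_token_id == eos_id:
        return True
    # ...or when the decoded output ends with the reverse prompt (here "User:").
    return tail_text.endswith(reverse_prompt)

print(should_stop(2, 2, "Toy Story (1995).", "User:"))       # True: EOS id matched
print(should_stop(42, 2, "Toy Story (1995). </s>", "User:")) # False: never stops
```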
Edit: error message when trying to include metadata:
python3 convert-llama-ggmlv3-to-gguf.py -i ~/wizardmath-7b-v1.0.ggmlv3.q4_0.bin -o ~/wizardM2.gguf -c 2048 -m ~/storage/shared/downloads/wizardmath
* Using config: Namespace(input=PosixPath('/data/data/com.termux/files/home/wizardmath-7b-v1.0.ggmlv3.q4_0.bin'), output=PosixPath('/data/data/com.termux/files/home/wizardM2.gguf'), name=None, desc=None, gqa=1, eps='5.0e-06', context_length=2048, model_metadata_dir=PosixPath('/data/data/com.termux/files/home/storage/shared/downloads/wizardmath'), vocab_dir=None, vocabtype='spm')
=== WARNING === Be aware that this conversion script is best-effort. Use a native GGUF model if possible. === WARNING ===
* Scanning GGML input file
* GGML model hyperparameters: <Hyperparameters: n_vocab=32001, n_embd=4096, n_mult=5504, n_head=32, n_layer=32, n_rot=128, n_ff=11008, ftype=2>
Traceback (most recent call last):
  File "/data/data/com.termux/files/home/llama.cpp/convert-llama-ggmlv3-to-gguf.py", line 333, in <module>
    main()
  File "/data/data/com.termux/files/home/llama.cpp/convert-llama-ggmlv3-to-gguf.py", line 323, in main
    (params_override, vocab_override) = handle_metadata(cfg, model.hyperparameters)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/home/llama.cpp/convert-llama-ggmlv3-to-gguf.py", line 274, in handle_metadata
    import convert
  File "/data/data/com.termux/files/home/llama.cpp/convert.py", line 27, in <module>
    from sentencepiece import SentencePieceProcessor  # type: ignore
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ModuleNotFoundError: No module named 'sentencepiece'
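The usual fix for this error is to install the missing package into the same interpreter that runs the script, e.g. `pip install sentencepiece`. A quick stdlib-only check (an illustrative helper, not part of the conversion script) for which requirements are missing:

```python
import importlib.util

def missing_modules(names):
    # Return the subset of module names that cannot be imported
    # in the current environment.
    return [n for n in names if importlib.util.find_spec(n) is None]

# e.g. missing_modules(["sentencepiece", "numpy"]) lists whatever still
# needs a `pip install` before convert.py can be imported.
```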
Repo & here's the content of ~/storage/shared/downloads/wizardmath: