It is currently bugged. See results of `quantize-stats` on M1:

```
$ ./quantize-stats -m models/7B/ggml-model-f16.bin
Loading model
llama.cpp: loading model from models/7B/ggml-model-f16.bin
llama_model_load_internal: format     = ggjt v1 (latest)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 256
llama_model_load_internal: n_embd     = 4096
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 32
llama_model_load_internal: n_layer    = 32
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: f16        = 1
llama_model_load_internal: n_ff       = 11008
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size =  59.11 KB
llama_model_load_internal: mem required  = 14645.07 MB (+ 2052.00 MB per state)
llama_init_from_file: kv self size  =  256.00 MB
note: source model is f16
testing 291 layers with max size 131072000
q4_0 : rmse 0.00222150, maxerr 0.18429124, 95pct<0.0040, median<0.0018
q4_1 : rmse 0.00360044, maxerr 0.26373291, 95pct<0.0066, median<0.0028

main: total time = 93546.68 ms
```
The Q4_1 RMSE is too high, worse even than Q4_0.
There is a bug in the following piece of code:
https://github.com/ggerganov/llama.cpp/blob/180b693a47b6b825288ef9f2c39d24b6eea4eea6/ggml.c#L922-L955
We should fix it.
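For context, here is a minimal scalar sketch of what Q4_1 quantization is supposed to compute, assuming the usual ggml block layout: QK = 32 values per block, a per-block minimum `m` and step `d = (max - min) / 15`, and two 4-bit indices packed per byte. This is an illustrative reference, not the ggml.c code itself, and the function and test names below are hypothetical; it just round-trips one block and prints the RMSE, which is the quantity `quantize-stats` reports.

```c
#include <math.h>
#include <stdint.h>
#include <stdio.h>

#define QK 32  // block size used by ggml's 4-bit formats

// Reference (scalar) Q4_1 quantization of one block: store the block
// minimum m and the step d = (max - min) / 15, then encode each value
// as a 4-bit index round((x - m) / d), two indices per byte.
static void quantize_block_q4_1(const float *x, uint8_t *q, float *d, float *m) {
    float min =  INFINITY;
    float max = -INFINITY;
    for (int i = 0; i < QK; i++) {
        if (x[i] < min) min = x[i];
        if (x[i] > max) max = x[i];
    }
    *m = min;
    *d = (max - min) / 15.0f;
    const float id = *d != 0.0f ? 1.0f / *d : 0.0f;
    for (int i = 0; i < QK; i += 2) {
        const uint8_t q0 = (uint8_t) fminf(15.0f, roundf((x[i + 0] - min) * id));
        const uint8_t q1 = (uint8_t) fminf(15.0f, roundf((x[i + 1] - min) * id));
        q[i / 2] = q0 | (q1 << 4);  // pack two 4-bit values per byte
    }
}

// Round-trip one block through quantize/dequantize and report the RMSE.
int main(void) {
    float   x[QK], y[QK];
    uint8_t q[QK / 2];
    float   d, m;
    for (int i = 0; i < QK; i++) {
        x[i] = sinf(0.37f * i);  // arbitrary test data
    }
    quantize_block_q4_1(x, q, &d, &m);
    for (int i = 0; i < QK; i += 2) {
        y[i + 0] = m + d * (q[i / 2] & 0x0F);  // dequantize: x ~ m + d * q
        y[i + 1] = m + d * (q[i / 2] >> 4);
    }
    double sum = 0.0;
    for (int i = 0; i < QK; i++) {
        const double e = y[i] - x[i];
        sum += e * e;
    }
    printf("rmse %.8f\n", sqrt(sum / QK));
    return 0;
}
```

Since Q4_1 stores both a minimum and a step, it has an extra degree of freedom over Q4_0 and its RMSE is expected to be lower on the same data; the numbers above show the opposite, which is why the NEON path linked above is suspect.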