examples/perplexity/README.md
41 additions, 1 deletion
@@ -32,7 +32,7 @@ In addition to the KL divergence the following statistics are calculated with `-

## LLaMA 3 8b Scoreboard

- Results are sorted by Kullback-Leibler divergence relative to FP16.
+ Results were generated using the CUDA backend and are sorted by Kullback-Leibler divergence relative to FP16.

The "WT" importance matrices were created using varying numbers of Wikitext tokens and can be found [here](https://huggingface.co/JohannesGaessler/llama.cpp_importance_matrices/blob/main/imatrix-llama_3-8b-f16-2.7m_tokens.dat).

| Quantization | imatrix | Model size [GiB] | PPL | ΔPPL | KLD | Mean Δp | RMS Δp |

@@ -89,6 +89,8 @@ K-quants score better on mean Δp than the legacy quants than e.g. KL divergence
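For context, every scoreboard column is derived from comparing the quantized model's per-token probabilities against those of the FP16 baseline. The sketch below (Python, not llama.cpp's actual implementation) illustrates how such statistics could be computed, assuming you already have the full next-token distributions from both models and the ids of the observed tokens; the exact definitions and sign conventions used by the perplexity tool's KL-divergence mode may differ.

```python
import numpy as np

def scoreboard_stats(p_fp16, q_quant, target_ids):
    """Illustrative sketch of the scoreboard statistics.

    p_fp16, q_quant: arrays of shape (n_tokens, n_vocab); each row is a
        next-token probability distribution (rows sum to 1) from the FP16
        baseline and the quantized model, respectively.
    target_ids: array of shape (n_tokens,) with the observed token ids.
    """
    eps = 1e-12
    rows = np.arange(len(target_ids))

    # Perplexity of each model over the observed tokens, and the difference.
    ppl_fp16 = np.exp(-np.mean(np.log(p_fp16[rows, target_ids] + eps)))
    ppl_quant = np.exp(-np.mean(np.log(q_quant[rows, target_ids] + eps)))
    delta_ppl = ppl_quant - ppl_fp16

    # KL divergence KL(FP16 || quantized), averaged over token positions.
    kld = np.mean(
        np.sum(p_fp16 * (np.log(p_fp16 + eps) - np.log(q_quant + eps)), axis=1)
    )

    # Δp: change in the probability assigned to the correct token
    # (often reported as a percentage in the scoreboard).
    dp = q_quant[rows, target_ids] - p_fp16[rows, target_ids]
    mean_dp = np.mean(dp)               # "Mean Δp"
    rms_dp = np.sqrt(np.mean(dp ** 2))  # "RMS Δp"

    return ppl_quant, delta_ppl, kld, mean_dp, rms_dp
```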