1 file changed, +4 −2 lines changed

```diff
@@ -18,10 +18,12 @@ The main goal of `llama.cpp` is to run the LLaMA model using 4-bit integer quantization
 
 - Plain C/C++ implementation without dependencies
 - Apple silicon first-class citizen - optimized via ARM NEON and Accelerate framework
-- AVX2 support for x86 architectures
+- AVX, AVX2 and AVX512 support for x86 architectures
 - Mixed F16 / F32 precision
-- 4-bit integer quantization support
+- 4-bit, 5-bit and 8-bit integer quantization support
 - Runs on the CPU
+- OpenBLAS support
+- cuBLAS and CLBlast support
 
 The original implementation of `llama.cpp` was [hacked in an evening](https://github.com/ggerganov/llama.cpp/issues/33#issuecomment-1465108022).
 Since then, the project has improved significantly thanks to many contributions. This project is for educational purposes and serves
```
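The low-bit integer quantization mentioned in the feature list generally works blockwise: a run of weights shares one floating-point scale, and each weight is stored as a small signed integer. The sketch below is a minimal Python illustration of that general idea under assumed parameters (32-value blocks, symmetric rounding to [-8, 7]); it is not llama.cpp's actual Q4 bit layout, and the names `quantize_block`/`dequantize_block` are hypothetical.

```python
# Illustrative blockwise 4-bit symmetric quantization (NOT the real ggml Q4 format).
BLOCK = 32  # assumed number of values that share one scale

def quantize_block(values):
    """Map a block of floats to 4-bit signed ints in [-8, 7] plus one scale."""
    amax = max(abs(v) for v in values) or 1.0
    scale = amax / 7.0  # largest magnitude maps to +/-7
    q = [max(-8, min(7, round(v / scale))) for v in values]
    return scale, q

def dequantize_block(scale, q):
    """Recover approximate floats from the 4-bit codes and the block scale."""
    return [scale * x for x in q]

vals = [0.1 * i for i in range(BLOCK)]
scale, q = quantize_block(vals)
restored = dequantize_block(scale, q)
# Round-trip error is bounded by half the quantization step (scale / 2).
err = max(abs(a - b) for a, b in zip(vals, restored))
```

The memory saving is what makes this attractive on CPUs: each weight costs 4 bits plus an amortized share of one scale per block, instead of 16 or 32 bits.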