
Commit 78b00dd

README: updated introduction (#5343)

* README: updated introduction
* readme : update
---------
Co-authored-by: Georgi Gerganov <[email protected]>

1 parent: c6b3955

1 file changed, 30 insertions(+), 19 deletions(-)

README.md
```diff
@@ -6,7 +6,7 @@
 
 [Roadmap](https://github.com/users/ggerganov/projects/7) / [Project status](https://github.com/ggerganov/llama.cpp/discussions/3471) / [Manifesto](https://github.com/ggerganov/llama.cpp/discussions/205) / [ggml](https://github.com/ggerganov/ggml)
 
-Inference of [LLaMA](https://arxiv.org/abs/2302.13971) model in pure C/C++
+Inference of Meta's [LLaMA](https://arxiv.org/abs/2302.13971) model (and others) in pure C/C++
 
 ### Hot topics
 
@@ -58,30 +58,35 @@ Inference of [LLaMA](https://arxiv.org/abs/2302.13971) model in pure C/C++
 
 ## Description
 
-The main goal of `llama.cpp` is to run the LLaMA model using 4-bit integer quantization on a MacBook
+The main goal of `llama.cpp` is to enable LLM inference with minimal setup and state-of-the-art performance on a wide
+variety of hardware - locally and in the cloud.
 
-- Plain C/C++ implementation without dependencies
-- Apple silicon first-class citizen - optimized via ARM NEON, Accelerate and Metal frameworks
+- Plain C/C++ implementation without any dependencies
+- Apple silicon is a first-class citizen - optimized via ARM NEON, Accelerate and Metal frameworks
 - AVX, AVX2 and AVX512 support for x86 architectures
-- Mixed F16 / F32 precision
-- 2-bit, 3-bit, 4-bit, 5-bit, 6-bit and 8-bit integer quantization support
-- CUDA, Metal, OpenCL, SYCL GPU backend support
+- 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, and 8-bit integer quantization for faster inference and reduced memory use
+- Custom CUDA kernels for running LLMs on NVIDIA GPUs (support for AMD GPUs via HIP)
+- Vulkan, SYCL, and (partial) OpenCL backend support
+- CPU+GPU hybrid inference to partially accelerate models larger than the total VRAM capacity
 
-The original implementation of `llama.cpp` was [hacked in an evening](https://github.com/ggerganov/llama.cpp/issues/33#issuecomment-1465108022).
-Since then, the project has improved significantly thanks to many contributions. This project is mainly for educational purposes and serves
-as the main playground for developing new features for the [ggml](https://github.com/ggerganov/ggml) library.
+Since its [inception](https://github.com/ggerganov/llama.cpp/issues/33#issuecomment-1465108022), the project has
+improved significantly thanks to many contributions. It is the main playground for developing new features for the
+[ggml](https://github.com/ggerganov/ggml) library.
 
 **Supported platforms:**
 
 - [X] Mac OS
 - [X] Linux
 - [X] Windows (via CMake)
 - [X] Docker
+- [X] FreeBSD
 
 **Supported models:**
 
 - [X] LLaMA 🦙
 - [x] LLaMA 2 🦙🦙
+- [X] [Mistral AI v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
+- [x] [Mixtral MoE](https://huggingface.co/models?search=mistral-ai/Mixtral)
 - [X] Falcon
 - [X] [Alpaca](https://github.com/ggerganov/llama.cpp#instruction-mode-with-alpaca)
 - [X] [GPT4All](https://github.com/ggerganov/llama.cpp#using-gpt4all)
```
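
The k-bit quantization listed among the features above is applied offline with the `quantize` tool built from this repository. A minimal sketch, assuming a model already converted to GGUF; the file paths and names are illustrative placeholders, not taken from the diff:

```sh
# Quantize an F16 GGUF model to 4-bit (Q4_K_M), shrinking it to roughly a
# quarter of its original size. Paths and model names are hypothetical.
./quantize ./models/llama-2-7b/ggml-model-f16.gguf ./models/llama-2-7b/ggml-model-Q4_K_M.gguf Q4_K_M
```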
```diff
@@ -95,7 +100,6 @@ as the main playground for developing new features for the [ggml](https://github
 - [X] [Baichuan 1 & 2](https://huggingface.co/models?search=baichuan-inc/Baichuan) + [derivations](https://huggingface.co/hiyouga/baichuan-7b-sft)
 - [X] [Aquila 1 & 2](https://huggingface.co/models?search=BAAI/Aquila)
 - [X] [Starcoder models](https://github.com/ggerganov/llama.cpp/pull/3187)
-- [X] [Mistral AI v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
 - [X] [Refact](https://huggingface.co/smallcloudai/Refact-1_6B-fim)
 - [X] [Persimmon 8B](https://github.com/ggerganov/llama.cpp/pull/3410)
 - [X] [MPT](https://github.com/ggerganov/llama.cpp/pull/3417)
@@ -104,15 +108,14 @@ as the main playground for developing new features for the [ggml](https://github
 - [X] [StableLM-3b-4e1t](https://github.com/ggerganov/llama.cpp/pull/3586)
 - [x] [Deepseek models](https://huggingface.co/models?search=deepseek-ai/deepseek)
 - [x] [Qwen models](https://huggingface.co/models?search=Qwen/Qwen)
-- [x] [Mixtral MoE](https://huggingface.co/models?search=mistral-ai/Mixtral)
 - [x] [PLaMo-13B](https://github.com/ggerganov/llama.cpp/pull/3557)
 - [x] [GPT-2](https://huggingface.co/gpt2)
 - [x] [CodeShell](https://github.com/WisdomShell/codeshell)
 
 **Multimodal models:**
 
-- [x] [Llava 1.5 models](https://huggingface.co/collections/liuhaotian/llava-15-653aac15d994e992e2677a7e)
-- [x] [Bakllava](https://huggingface.co/models?search=SkunkworksAI/Bakllava)
+- [x] [LLaVA 1.5 models](https://huggingface.co/collections/liuhaotian/llava-15-653aac15d994e992e2677a7e)
+- [x] [BakLLaVA](https://huggingface.co/models?search=SkunkworksAI/Bakllava)
 - [x] [Obsidian](https://huggingface.co/NousResearch/Obsidian-3B-V0.5)
 - [x] [ShareGPT4V](https://huggingface.co/models?search=Lin-Chen/ShareGPT4V)
 - [x] [MobileVLM 1.7B/3B models](https://huggingface.co/models?search=mobileVLM)
@@ -137,14 +140,22 @@ as the main playground for developing new features for the [ggml](https://github
 
 **UI:**
 
+Unless otherwise noted these projects are open-source with permissive licensing:
+
+- [iohub/collama](https://github.com/iohub/coLLaMA)
+- [janhq/jan](https://github.com/janhq/jan) (AGPL)
 - [nat/openplayground](https://github.com/nat/openplayground)
-- [oobabooga/text-generation-webui](https://github.com/oobabooga/text-generation-webui)
-- [withcatai/catai](https://github.com/withcatai/catai)
-- [semperai/amica](https://github.com/semperai/amica)
+- [LMStudio](https://lmstudio.ai/) (proprietary)
+- [LostRuins/koboldcpp](https://github.com/LostRuins/koboldcpp) (AGPL)
+- [Mozilla-Ocho/llamafile](https://github.com/Mozilla-Ocho/llamafile)
+- [nomic-ai/gpt4all](https://github.com/nomic-ai/gpt4all)
+- [ollama/ollama](https://github.com/ollama/ollama)
+- [oobabooga/text-generation-webui](https://github.com/oobabooga/text-generation-webui) (AGPL)
 - [psugihara/FreeChat](https://github.com/psugihara/FreeChat)
 - [ptsochantaris/emeltal](https://github.com/ptsochantaris/emeltal)
-- [iohub/collama](https://github.com/iohub/coLLaMA)
-- [pythops/tenere](https://github.com/pythops/tenere)
+- [pythops/tenere](https://github.com/pythops/tenere) (AGPL)
+- [semperai/amica](https://github.com/semperai/amica)
+- [withcatai/catai](https://github.com/withcatai/catai)
 
 ---
 
```
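
The CPU+GPU hybrid inference mentioned in the new feature list is exposed through the `-ngl` (`--n-gpu-layers`) option of the `main` example: as many layers as fit in VRAM are offloaded to the GPU and the remainder run on the CPU. A hedged sketch; the model path, prompt, and layer count below are illustrative:

```sh
# Offload 35 transformer layers to the GPU and generate 128 tokens;
# lower -ngl if the model does not fit in VRAM.
./main -m ./models/llama-2-7b/ggml-model-Q4_K_M.gguf -p "Building a website can be done in 10 simple steps:" -n 128 -ngl 35
```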
