Skip to content

Commit 4d56ad8

Browse files
authored
Update README.md
1 parent 21f9308 commit 4d56ad8

File tree

1 file changed

+11
-0
lines changed

1 file changed

+11
-0
lines changed

README.md

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,17 @@ To install, run
44
```make LLAMA_HIPBLAS=1```
55
To use ROCM, set GPU layers with --gpulayers when starting koboldcpp
66

7+
Comparison with OpenCL using 6800xt
8+
| Model | Offloading Method | Time Taken - Processing 593 tokens| Time Taken - Generating 200 tokens| Total Time | Perf. Diff.
9+
|-----------------|----------------------------|--------------------|--------------------|------------|---|
10+
| Robin 7b q6_K |CLBLAST 6-t, All Layers on GPU | 6.8s (11ms/T) | 12.0s (60ms/T) | 18.7s (10.7T/s) | 1x
11+
| Robin 7b q6_K |ROCM 1-t, All Layers on GPU | 1.4s (2ms/T) | 5.5s (28ms/T) | 6.9s (29.1T/s)| **2.71x**
12+
| Robin 13b q5_K_M |CLBLAST 6-t, All Layers on GPU | 10.9s (18ms/T) | 16.7s (83ms/T) | 27.6s (7.3T/s) | 1x
13+
| Robin 13b q5_K_M |ROCM 1-t, All Layers on GPU | 2.4s (4ms/T) | 7.8s (39ms/T) | 10.2s (19.6T/s)| **2.63x**
14+
| Robin 33b q4_K_S |CLBLAST 6-t, 46/63 Layers on GPU | 23.2s (39ms/T) | 48.6s (243ms/T) | 71.9s (2.8T/s) | 1x
15+
| Robin 33b q4_K_S |CLBLAST 6-t, 50/63 Layers on GPU | 25.5s (43ms/T) | 44.6s (223ms/T) | 70.0s (2.9T/s) | 1x
16+
| Robin 33b q4_K_S |ROCM 6-t, 46/63 Layers on GPU | 14.6s (25ms/T) | 44.1s (221ms/T) | 58.7s (3.4T/s)| **1.19x**
17+
718
--------
819
A self contained distributable from Concedo that exposes llama.cpp function bindings, allowing it to be used via a simulated Kobold API endpoint.
920

0 commit comments

Comments
 (0)