To install, run
```
make LLAMA_HIPBLAS=1
```
To use ROCm, set the number of GPU layers to offload with `--gpulayers` when starting koboldcpp.
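For example, a hypothetical launch line might look like the following (the model path and layer count are placeholders; any backend-selection flags depend on your build):

```
# Hypothetical example: offload 43 layers to the GPU when starting koboldcpp.
# Adjust the model path and --gpulayers value for your own model and VRAM.
python koboldcpp.py --model /path/to/model.bin --gpulayers 43
```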
Comparison with OpenCL using a 6800 XT:
| Model | Offloading Method | Time Taken - Processing 593 tokens | Time Taken - Generating 200 tokens | Total Time | Perf. Diff. |
|-------|-------------------|------------------------------------|------------------------------------|------------|-------------|
| Robin 7b q6_K | CLBLAST 6-t, All Layers on GPU | 6.8s (11ms/T) | 12.0s (60ms/T) | 18.7s (10.7T/s) | 1x |
| Robin 7b q6_K | ROCM 1-t, All Layers on GPU | 1.4s (2ms/T) | 5.5s (28ms/T) | 6.9s (29.1T/s) | **2.71x** |
| Robin 13b q5_K_M | CLBLAST 6-t, All Layers on GPU | 10.9s (18ms/T) | 16.7s (83ms/T) | 27.6s (7.3T/s) | 1x |
| Robin 13b q5_K_M | ROCM 1-t, All Layers on GPU | 2.4s (4ms/T) | 7.8s (39ms/T) | 10.2s (19.6T/s) | **2.63x** |
| Robin 33b q4_K_S | CLBLAST 6-t, 46/63 Layers on GPU | 23.2s (39ms/T) | 48.6s (243ms/T) | 71.9s (2.8T/s) | 1x |
| Robin 33b q4_K_S | CLBLAST 6-t, 50/63 Layers on GPU | 25.5s (43ms/T) | 44.6s (223ms/T) | 70.0s (2.9T/s) | 1x |
| Robin 33b q4_K_S | ROCM 6-t, 46/63 Layers on GPU | 14.6s (25ms/T) | 44.1s (221ms/T) | 58.7s (3.4T/s) | **1.19x** |
--------
A self-contained distributable from Concedo that exposes llama.cpp function bindings, allowing it to be used via a simulated Kobold API endpoint.