Closed
Labels
Apple Metal, bug-unconfirmed, medium severity, stale
Description
What happened?
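Gemma 2 runs slower with flash attention (FA) enabled than with it disabled on Apple Silicon (M3 Max). Both prompt processing (pp512) and token generation (tg64) are affected, for the 2B and the 9B model.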
Name and Version
version: 3642 (1d1ccce)
built with Apple clang version 15.0.0 (clang-1500.3.9.4) for arm64-apple-darwin23.6.0
What operating system are you seeing the problem on?
Mac
Relevant log output
| model | size | params | backend | ngl | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | ---: | ------------: | ---------------: |
| gemma2 2B Q8_0 | 3.17 GiB | 3.20 B | Metal | 99 | 0 | 0 | pp512 | 2360.42 ± 3.71 |
| gemma2 2B Q8_0 | 3.17 GiB | 3.20 B | Metal | 99 | 0 | 0 | tg64 | 85.54 ± 0.05 |
| gemma2 2B Q8_0 | 3.17 GiB | 3.20 B | Metal | 99 | 1 | 0 | pp512 | 1487.45 ± 3.27 |
| gemma2 2B Q8_0 | 3.17 GiB | 3.20 B | Metal | 99 | 1 | 0 | tg64 | 50.99 ± 0.17 |
| gemma2 9B Q8_0 | 10.05 GiB | 10.16 B | Metal | 99 | 0 | 0 | pp512 | 608.84 ± 0.96 |
| gemma2 9B Q8_0 | 10.05 GiB | 10.16 B | Metal | 99 | 0 | 0 | tg64 | 30.29 ± 0.04 |
| gemma2 9B Q8_0 | 10.05 GiB | 10.16 B | Metal | 99 | 1 | 0 | pp512 | 397.25 ± 23.27 |
| gemma2 9B Q8_0 | 10.05 GiB | 10.16 B | Metal | 99 | 1 | 0 | tg64 | 21.33 ± 0.01 |
build: 1d1ccce6 (3642)
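To put a number on the regression, the table above can be reduced to one FA/no-FA throughput ratio per row pair. The following is only a minimal sketch: the t/s values are copied by hand from the log output, and the names `results` and `fa_ratio` are made up for this illustration, they are not part of llama.cpp or llama-bench.

```python
# Mean throughput (tokens/s) copied from the llama-bench table above.
# Keys are (model, test); values are (fa=0, fa=1).
# The +/- deviations from the table are ignored in this sketch.
results = {
    ("gemma2 2B Q8_0", "pp512"): (2360.42, 1487.45),
    ("gemma2 2B Q8_0", "tg64"): (85.54, 50.99),
    ("gemma2 9B Q8_0", "pp512"): (608.84, 397.25),
    ("gemma2 9B Q8_0", "tg64"): (30.29, 21.33),
}


def fa_ratio(fa_off: float, fa_on: float) -> float:
    """Return throughput with FA divided by throughput without FA.

    A value below 1.0 means flash attention made the run slower.
    """
    return fa_on / fa_off


# Hypothetical helper structure: one ratio per (model, test) pair.
ratios = {key: fa_ratio(*pair) for key, pair in results.items()}

for (model, test), ratio in ratios.items():
    slowdown_pct = (1.0 - ratio) * 100.0
    print(f"{model} {test}: FA/no-FA = {ratio:.2f} ({slowdown_pct:.0f}% slower)")
```

With these numbers every ratio is below 1.0, i.e. enabling FA costs roughly 30% to 40% of throughput in all four cases.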