My computer is an M1 Max Mac Studio with a 32-core GPU and 64 GB of RAM. The macOS version is Sonoma 14.4.1.
I ran `llama-bench` from commit 4cc120c and it shows low GPU usage during prompt processing. Of course, inference with `main` and `server` shows the same low GPU usage.
In the image above, I ran the benchmark for IQ2_XXS, IQ2_XS, IQ2_S, IQ2_M, and Q2_K_S, but IQ1_S and IQ1_M from https://huggingface.co/MaziyarPanahi/Mixtral-8x22B-v0.1-GGUF show the same low GPU usage.
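For anyone trying to reproduce this, here is a minimal sketch of the kind of invocation involved. The binary path, model filename, and the `-p`/`-n`/`-ngl` values below are illustrative placeholders, not the exact settings from the run above:

```sh
# Build llama.cpp at commit 4cc120c with Metal enabled, then run the benchmark.
# Model file and parameter values are assumptions for illustration only.
./llama-bench \
  -m models/Mixtral-8x22B-v0.1.IQ2_XS.gguf \
  -p 512 \      # prompt-processing batch to benchmark
  -n 128 \      # tokens to generate
  -ngl 99       # offload all layers to the GPU
```

While this runs, GPU utilization can be watched in Activity Monitor's GPU History window to confirm the low usage during the prompt-processing phase.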