Mis-leading whisper-bench (Now with more Macs) #3139
Comments
Linux results create a new winner.
Bench : PC (Perf Linux) = 2.975
So the same PC comes out 25% and 47% faster in the results vs Win11. The Laptop is actually running Linux off an external USB drive in this case, so the overall experience is a tad clunky, but this won't affect the runtimes. |
Rented some Mac Minis today so I now have a comparison of M1, M2, and M2 Pro to add to the list and a new 2nd place holder
And the runner-up (Pi 5) can even type faster than me. The world record typing speed is 360 wpm :) |
Thought I'd share my results. I performed 3 tests on a base M1 MBA, first using
Second using Whisper with mlx-audio:
Third using Parakeet with mlx-audio:
|
Have any of you done any comparisons with WhisperKit? |
Not me - that one is too platform-specific for my liking. |
Fair enough. |
It'd be interesting to hear from @kth8 why the speeds are so different when they're doing the same thing. Is it exactly the same work in each case? The whisper-cli version, for example, outputs all the text while grabbing all the tokens. |
Another option I've been using is Google Gemini 2.5. Considering the potato specs of my MBA, it's the fastest option and can handle complex system prompts with diarization.
|
Using |
Yet to delve into audio. On initial examination, whisper-bench tends to go for SDL-related stuff. There's conceptual FFMPEG for Linux but I want it portable...
Anyway - new winner (though obviously not for long) for my version(s). After further investigation I got mine down to 93 secs under Windows (the lower figures got me into a frenzy of benchmarking). The full output after the end of the Aladdin transcription is now...
Linux should shave some more off that as well (but I need to re-build my ext Linux drive so that's a --- coming soon).
The trick was rather unintuitive: DON'T use CUDA / BLAS / OpenVINO. Stick to pure Vulkan with CPU for backup. That's pretty portable (but not perfect) for the non-Mac world.
A few things are high on the let's-explore list now. GPU support may be a nice option for OpenVINO (only CPU easily available AFAIK). And, of course, playing with audio is ripe for more speed increases. The sample rate on the test audio is 22.5 kHz, which must be producing way more input data than needed (bright thought, I'll resample to 8k + see what happens).
I've got a really junky old laptop that should prove useful for non-CUDA testing. My fork of the project allows me to select which backends to use, which is also a great benefit. |
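For anyone wanting to try the resampling idea, here is a rough sketch of how it could be scripted. It assumes `ffmpeg` is installed and on the PATH; the helper names and file names are made up for illustration and are not part of whisper.cpp. The default rate is 16 kHz because that is the rate Whisper models work at, so the 8k experiment may well trade accuracy for speed.

```python
import subprocess

def build_resample_cmd(src: str, dst: str, rate_hz: int = 16000) -> list[str]:
    """Build an ffmpeg command that converts src to mono 16-bit PCM WAV at rate_hz.

    Hypothetical helper: only standard ffmpeg options are used
    (-y overwrite, -i input, -ar sample rate, -ac channels, -c:a audio codec).
    """
    return [
        "ffmpeg", "-y",
        "-i", src,
        "-ar", str(rate_hz),   # target sample rate in Hz
        "-ac", "1",            # downmix to mono
        "-c:a", "pcm_s16le",   # 16-bit little-endian PCM
        dst,
    ]

def resample(src: str, dst: str, rate_hz: int = 16000) -> None:
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_resample_cmd(src, dst, rate_hz), check=True)

if __name__ == "__main__":
    # Placeholder file names; substitute the real recording.
    print(" ".join(build_resample_cmd("aladdin.mp3", "aladdin_8k.wav", 8000)))
```

Timing whisper-cli against both the original file and the resampled one would show whether the input rate makes any real difference.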
The numbers returned by whisper-bench are misleading
I've got a Mac M4 Mini 256G (the cheap one) and a Lenovo Laptop with a 4070 GPU in it. I usually use the laptop in Hybrid mode (Performance mode gets noisy) - the Mac is, of course, silent all the time anyway
I've been relying on whisper-bench to indicate which device is faster. Thing is, it's wrong (or more accurately, rather misleading)
I just got the M4 and my Laptop on a fairly equal footing when it comes to whisper.cpp facilities: both have OpenVINO available, and one has an M4 with OpenCL while the other has CUDA
I'd always thought that the Mac was slower than the Laptop - well, until I drag-raced them against each other via CLI and a 35 minute public domain recording of Aladdin and the Magic Lamp
The figures come out like this (all tests use the medium.en model)...
whisper-bench (total runtime in seconds)
PC (Perf Mode) = 3.694
PC (Hybrid Mode) = 3.721
Mac M4 Mini 256G = 6.929
So the laptop's performance mode is 187.6% the speed of the M4?
Or is it?
Next I ran whisper-cli over the recording of Aladdin and the Magic Lamp (35m 06s - mp3@64k - 16M filesize - 5379 words)
whisper-cli (total runtime in seconds + words transcribed per minute)
Mac M4 Mini 256G = 172.623 = 1869.623 wpm
PC (Perf Mode) = 186.730 = 1728.378 wpm
PC (Hybrid Mode) = 202.850 = 1591.027 wpm
Now the Mac is 108% the speed of the Laptop in Performance Mode
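For anyone repeating the test, the arithmetic behind the figures above can be sketched in a few lines of Python. The function names are just illustrative; the runtimes and the 5379 word count are the ones quoted in this post.

```python
WORDS = 5379  # word count of the Aladdin recording, as quoted above

def words_per_minute(words: int, runtime_s: float) -> float:
    """Words transcribed per minute for a run that took runtime_s seconds."""
    return words * 60.0 / runtime_s

def relative_speed_pct(reference_s: float, candidate_s: float) -> float:
    """Speed of the candidate as a percentage of the reference.

    A shorter runtime means a higher speed, so the ratio is reference / candidate.
    """
    return 100.0 * reference_s / candidate_s

# Runtimes in seconds, copied from the two lists above.
bench = {"PC (Perf Mode)": 3.694, "PC (Hybrid Mode)": 3.721, "Mac M4 Mini 256G": 6.929}
cli = {"Mac M4 Mini 256G": 172.623, "PC (Perf Mode)": 186.730, "PC (Hybrid Mode)": 202.850}

if __name__ == "__main__":
    for name, secs in cli.items():
        print(f"{name}: {words_per_minute(WORDS, secs):.3f} wpm")
    # Laptop vs Mac according to whisper-bench
    print(f"{relative_speed_pct(bench['Mac M4 Mini 256G'], bench['PC (Perf Mode)']):.1f}%")
    # Mac vs Laptop according to whisper-cli
    print(f"{relative_speed_pct(cli['PC (Perf Mode)'], cli['Mac M4 Mini 256G']):.1f}%")
```

Plugging in your own runtime for the same audio file gives a wpm figure that can be compared directly with the list above.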
In the real world of course we'd be using whisper.cpp for things more like the whisper-cli test
If anyone wants to run the same test against their setup you can find the audio file here (Archive.org)
Mac was 1/3rd the price of the Laptop (but the Laptop plays better games)
Suppose I'd better switch to Linux on the Laptop and try the same test there next...