Can't disable gpu #1762

Open
thewh1teagle opened this issue Jan 13, 2024 · 13 comments
Labels
bug Something isn't working

Comments

@thewh1teagle
Contributor

thewh1teagle commented Jan 13, 2024

whisper_context_params.use_gpu = false;
doesn't work. It still tries to use OpenCL, which leads to a crash (specifically with OpenCL in my case).

I use it in my project, vibe.
This option is very important because I want to give users the best possible speed with the GPU, but fall back to the CPU in case of an error.
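
For illustration, the fallback described here could look roughly like the sketch below. Everything in it is hypothetical: demo_ctx, demo_init and g_gpu_works stand in for the real library's context type, init call and hardware state. It also assumes the init call reports failure by returning NULL; a hard crash inside the GPU backend, as reported in this issue, cannot be caught this way.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical context type; not part of whisper.cpp. */
typedef struct {
    bool on_gpu;
} demo_ctx;

static demo_ctx g_ctx;
static bool     g_gpu_works = false; /* simulates a machine where GPU init fails */

/* Hypothetical init call: returns NULL when the requested backend is unusable. */
static demo_ctx * demo_init(bool use_gpu) {
    if (use_gpu && !g_gpu_works) {
        return NULL;
    }
    g_ctx.on_gpu = use_gpu;
    return &g_ctx;
}

/* Try the GPU first, then retry on the CPU if that fails. */
demo_ctx * demo_init_with_fallback(void) {
    demo_ctx * ctx = demo_init(true);
    if (ctx == NULL) {
        ctx = demo_init(false);
    }
    return ctx;
}
```

This pattern only helps once use_gpu = false is actually honoured by the build, which is the point of this issue.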

@bobqianic bobqianic added the bug Something isn't working label Jan 13, 2024
@bobqianic
Collaborator

Hi @slaren , is there a way to completely turn off OpenCL during runtime? Thanks!

@ggerganov
Member

Currently, there is no way to disable the GPU completely when the project is built with OpenCL support. Will think about fixing this.

In the meantime, does the information from #888 help in any way?

@thewh1teagle
Contributor Author

@ggerganov
It doesn't help. Currently I use OpenBLAS, so at least the performance is much better than without it.
I'm looking to improve it further in the vibe project to get the best possible performance.

@chuck-fyn

@ggerganov I am also trying to turn off GPU use to allow for background processing on the iPhone. Apologies if this is obvious, but is it possible for me to turn off the OpenCL support so that I can turn off the GPU use?

@ggerganov
Member

You can easily update ggml.c to avoid all GPU calls (CUDA, OpenCL, etc.) if a global flag is set. For example here:

https://github.com/ggerganov/whisper.cpp/blob/1f50a7d29f85f221368e81201780e0c8dd631076/ggml.c#L9816-L9825

You can add a void ggml_gpu_set(bool enable); call that sets a global boolean flag and check the flag before each GPU call in ggml.c.

This is currently not officially supported in ggml because I want to figure out a better API. But as a quick workaround, I think this is the only option at the moment.
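
A minimal sketch of that workaround, not the actual ggml code. Only the name ggml_gpu_set comes from the comment above; the flag g_gpu_enabled, the backend_kind enum and the mul_mat_* functions are hypothetical stand-ins for the GPU and CPU paths in ggml.c.

```c
#include <stdbool.h>

/* Hypothetical global flag; GPU paths stay enabled by default. */
static bool g_gpu_enabled = true;

/* Setter suggested in the comment above. */
void ggml_gpu_set(bool enable) {
    g_gpu_enabled = enable;
}

/* Hypothetical result type so the chosen path is observable. */
typedef enum {
    BACKEND_CPU = 0,
    BACKEND_GPU = 1
} backend_kind;

/* Hypothetical stand-ins for a GPU kernel and its CPU equivalent. */
static backend_kind mul_mat_gpu(void) { return BACKEND_GPU; }
static backend_kind mul_mat_cpu(void) { return BACKEND_CPU; }

/* Check the flag before every GPU call and fall through to the CPU path. */
backend_kind mul_mat_dispatch(void) {
    if (g_gpu_enabled) {
        return mul_mat_gpu();
    }
    return mul_mat_cpu();
}
```

In a real patch, the same check would have to guard every GPU call site in ggml.c (CUDA, OpenCL, etc.), and backend initialization would need the same guard for the flag to prevent the device from being touched at all.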

@thewh1teagle
Contributor Author

thewh1teagle commented Jan 17, 2024

@ggerganov
I think that eventually it will be useful to have an is_available() function for each GPU method (CUDA, Core ML, etc.)

@thewh1teagle
Contributor Author

@ggerganov
Can we somehow get is_available() functions for each GPU platform, so we can easily decide which one to use?
I just added Core ML support to the vibe app and the performance is incredible (20x faster, and even more).

Also, regarding the option to disable the GPU using use_gpu = false, do you have any progress or plans?
I'm eager to add GPU support for Linux and Windows as well.
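
To make the request concrete, here is a rough sketch of what a per-backend is_available() probe could look like from the application side. None of these names exist in whisper.cpp or ggml; backend_entry, the probe functions and pick_backend are all hypothetical, and the GPU probe is a placeholder that always reports "unavailable".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical probe signature: returns true if the backend can be used. */
typedef bool (*probe_fn)(void);

typedef struct {
    const char * name;
    probe_fn     is_available;
} backend_entry;

/* Placeholder probes. A real one would query the driver or runtime. */
static bool fake_gpu_is_available(void) { return false; }
static bool cpu_is_available(void)      { return true;  }

/* Backends listed in order of preference, fastest first. */
static const backend_entry k_backends[] = {
    { "fake-gpu", fake_gpu_is_available },
    { "cpu",      cpu_is_available      },
};

/* Return the name of the first available backend, or NULL if none is. */
const char * pick_backend(void) {
    const size_t n = sizeof(k_backends) / sizeof(k_backends[0]);
    for (size_t i = 0; i < n; ++i) {
        if (k_backends[i].is_available()) {
            return k_backends[i].name;
        }
    }
    return NULL;
}
```

With the placeholder probes above, pick_backend() skips the GPU entry and selects the CPU entry.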

@WilliamTambellini
Contributor

Hi,
Same issue on Linux with a CUDA build: it still seems to init and use the CUDA GPU despite the '-ng' CLI argument:

./cuda/main -m ggml-base.en.bin -f samples/jfk.wav -ng
...
ggml_init_cublas: GGML_CUDA_FORCE_MMQ:   no
ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
ggml_init_cublas: found 1 CUDA devices:
  Device 0: Quadro RTX 3000, compute capability 7.5, VMM: yes
...

best

@slaren
Member

slaren commented Mar 28, 2024

@WilliamTambellini this no longer happens with the CUDA backend after the sync with ggml from yesterday.

@WilliamTambellini
Contributor

Thanks @slaren. Superb, I will pull, rebuild and retest. Congrats.

@WilliamTambellini
Contributor

Thanks @slaren @ggerganov
1.5.4 is already a few months old, from Jan 5th.
Would you mind doing a new release?
Best

@ggerganov
Member

ggerganov commented Apr 9, 2024

I'll probably make a new one soon, yes

@thewh1teagle
Contributor Author

thewh1teagle commented May 23, 2024

Is there an update regarding this feature?
Also, I can see that ggml.c checks whether the CPU supports AVX2, CUDA, etc. at compile time rather than at runtime using __builtin_cpu_supports("...").
That results in a crash if, for instance, the CPU doesn't support AVX2. Instead, we could add assertions for all of these with better error messages, and maybe choose which GPU platform to use (not sure if that's possible). For instance, I would like to use CLBlast by default, but use CUDA on Windows if it is available.
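
A small sketch of the runtime check mentioned above. __builtin_cpu_init and __builtin_cpu_supports are GCC/Clang builtins for x86 targets; the wrapper names cpu_has_avx2 and select_kernel are hypothetical, and on other compilers or architectures this sketch simply reports that AVX2 is not available.

```c
#include <stdbool.h>

/* Runtime AVX2 detection, guarded so the file still compiles elsewhere. */
bool cpu_has_avx2(void) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();                        /* populate the CPU feature data */
    return __builtin_cpu_supports("avx2") != 0;  /* nonzero if AVX2 is present */
#else
    return false;                                /* unknown platform: assume no AVX2 */
#endif
}

/* Hypothetical dispatcher: pick a code path by name instead of crashing. */
const char * select_kernel(void) {
    return cpu_has_avx2() ? "avx2" : "generic";
}
```

A check like this only prevents the crash if the AVX2 code lives in separately compiled functions; if the whole file is built with AVX2 enabled, the compiler may emit AVX2 instructions before the check runs.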

6 participants