[Feature request] WASM WebGPU #103

mark-beeby · 2022-10-27T11:27:13Z

It's clear that leveraging a GPU makes processing faster, and I believe WebGPU can in principle be used from WASM. Is it feasible to integrate with the GPU where it is available, e.g. in Chrome?

ggerganov · 2022-11-01T21:14:27Z

I'm not familiar with the WebGPU API.
If you demonstrate a basic matrix multiplication example using WebGPU, and it does not look too complicated, I might give it a try.
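
As a rough idea of what such a demonstration involves, a basic WebGPU matrix multiplication could look like the sketch below. It is an illustration only, not code from whisper.cpp, ggml, wasm_webgpu or webgpu-blas. The names `matmulWGSL`, `matmulCPU` and `matmulGPU` are hypothetical, the 8x8 workgroup size is an arbitrary choice, and matrices are assumed to be row-major `f32`. The WebGPU globals are accessed through `any` so the snippet compiles without the `@webgpu/types` package, and `matmulGPU` returns `null` when no adapter is available.

```typescript
// WGSL compute shader: each invocation computes one element of C = A * B.
// A is m x k, B is k x n, C is m x n, all row-major f32.
const matmulWGSL = `
struct Dims { m: u32, k: u32, n: u32 }

@group(0) @binding(0) var<storage, read> a: array<f32>;
@group(0) @binding(1) var<storage, read> b: array<f32>;
@group(0) @binding(2) var<storage, read_write> c: array<f32>;
@group(0) @binding(3) var<uniform> dims: Dims;

@compute @workgroup_size(8, 8)
fn main(@builtin(global_invocation_id) gid: vec3<u32>) {
  let row = gid.y;
  let col = gid.x;
  if (row >= dims.m || col >= dims.n) { return; }
  var sum = 0.0;
  for (var i = 0u; i < dims.k; i = i + 1u) {
    sum = sum + a[row * dims.k + i] * b[i * dims.n + col];
  }
  c[row * dims.n + col] = sum;
}`;

// CPU reference used to check the GPU result.
function matmulCPU(a: Float32Array, b: Float32Array, m: number, k: number, n: number): Float32Array {
  const c = new Float32Array(m * n);
  for (let i = 0; i < m; i++) {
    for (let j = 0; j < n; j++) {
      let sum = 0;
      for (let p = 0; p < k; p++) sum += a[i * k + p] * b[p * n + j];
      c[i * n + j] = sum;
    }
  }
  return c;
}

// Runs the shader above; resolves to null if WebGPU is not available.
async function matmulGPU(a: Float32Array, b: Float32Array, m: number, k: number, n: number): Promise<Float32Array | null> {
  const g = globalThis as any;
  const adapter = await g.navigator?.gpu?.requestAdapter();
  if (!adapter) return null;
  const device = await adapter.requestDevice();
  const { GPUBufferUsage, GPUMapMode } = g;

  // Helper: create a buffer and copy typed-array data into it.
  const upload = (data: ArrayBufferView, usage: number) => {
    const buf = device.createBuffer({ size: data.byteLength, usage: usage | GPUBufferUsage.COPY_DST });
    device.queue.writeBuffer(buf, 0, data);
    return buf;
  };
  const bufA = upload(a, GPUBufferUsage.STORAGE);
  const bufB = upload(b, GPUBufferUsage.STORAGE);
  // Padded to 16 bytes; the shader only reads the first three values.
  const bufDims = upload(new Uint32Array([m, k, n, 0]), GPUBufferUsage.UNIFORM);
  const bytesC = m * n * 4;
  const bufC = device.createBuffer({ size: bytesC, usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC });
  const readback = device.createBuffer({ size: bytesC, usage: GPUBufferUsage.MAP_READ | GPUBufferUsage.COPY_DST });

  const pipeline = device.createComputePipeline({
    layout: "auto",
    compute: { module: device.createShaderModule({ code: matmulWGSL }), entryPoint: "main" },
  });
  // Binding indices 0..3 follow the order of this array.
  const bindGroup = device.createBindGroup({
    layout: pipeline.getBindGroupLayout(0),
    entries: [bufA, bufB, bufC, bufDims].map((buffer, binding) => ({ binding, resource: { buffer } })),
  });

  const encoder = device.createCommandEncoder();
  const pass = encoder.beginComputePass();
  pass.setPipeline(pipeline);
  pass.setBindGroup(0, bindGroup);
  // One 8x8 workgroup per 8x8 tile of the output matrix.
  pass.dispatchWorkgroups(Math.ceil(n / 8), Math.ceil(m / 8));
  pass.end();
  encoder.copyBufferToBuffer(bufC, 0, readback, 0, bytesC);
  device.queue.submit([encoder.finish()]);

  // Map the staging buffer and copy the result out before unmapping.
  await readback.mapAsync(GPUMapMode.READ);
  const out = new Float32Array(readback.getMappedRange().slice(0));
  readback.unmap();
  return out;
}
```

In a browser with WebGPU enabled, `await matmulGPU(a, b, m, k, n)` should agree with `matmulCPU(a, b, m, k, n)` up to floating-point rounding. This naive kernel reads every operand from global memory; a tiled version using workgroup shared memory would likely be needed before drawing any performance conclusions.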

niklaskorz · 2022-12-09T10:38:06Z

I have some experience with WebGPU and might have a look at this. Note that WebGPU would allow GPU-based computation without depending on any vendor-specific libraries like CUDA, not only on the web but also natively (via Vulkan, DX12 or Metal) by using dawn or wgpu.

gut4 · 2022-12-24T20:19:22Z

This may be helpful: https://github.com/juj/wasm_webgpu

sandorkonya · 2023-03-20T13:19:00Z

@niklaskorz any chance that you could look at this? That would give this project a further boost (or did I miss something relevant and it has already been solved?)

patrickinminneapolis · 2023-03-20T13:50:51Z

I started looking into it. It's very easy to link wasm_webgpu into emscripten; then, in principle, you should be able to implement the matrix multiplication example from https://github.com/milhidaka/webgpu-blas. I have done this, but I am running into an issue with my shader. I am really curious whether WebGPU will give us real-time streaming performance.

ggerganov · 2023-03-20T14:06:49Z

On a similar topic, recently I found this project: https://github.com/xenova/transformers.js

It runs very efficient inference of Whisper tiny using WASM. They seem to be using something called ONNX Runtime. Although adapting to such a framework is out of scope for whisper.cpp, it seems like there is still a lot to gain in the existing WASM implementation. Even without using WASM SIMD, it seems to be possible to achieve much higher performance.

I wonder if there is something that could be done in ggml to speed up the WASM processing. Even if we don't reach ONNX Runtime performance level, it would still be very nice to improve the existing speed.

Regarding WebGPU: it would be great if someone provided a PoC. Transformers.js announced they will support WebGPU soon too, so it should be possible.

Edit: Btw, is there something like WASM BLAS?

erkkimon · 2024-06-12T20:16:42Z

Transformers.js now seems to have some kind of WebGPU implementation available. For those interested, check out this branch: xenova/whisper-web@main...experimental-webgpu

mark-beeby changed the title ~~[Feature request] WebGPU~~ [Feature request] WASM WebGPU Oct 27, 2022

ggerganov added the question label (Further information is requested) Oct 27, 2022

westurner mentioned this issue Sep 1, 2023

Instructions on how to build a wasm ggml. ggml-org/ggml#419

Closed
