Closed
Description
Somewhat related to this thread.
Is it within scope to implement a WebGPU-accelerated version of Whisper?
Not sure if this helps, but there is a C port of Whisper with a CPU implementation, and as mentioned in this discussion, the main thing that needs to be offloaded to the GPU is the GGML_OP_MUL_MAT operator.
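
To make the ask a bit more concrete: as the name suggests, that operator is a matrix multiplication. Below is a rough CPU reference sketch in TypeScript of the computation a WebGPU compute shader would need to reproduce. It assumes plain row-major float32 matrices; the function name `matMulReference` is made up for illustration, and ggml's actual tensor layout, strides, and quantized types are not modeled here.

```typescript
// Hypothetical helper for illustration only; not part of ggml or any Whisper port.
// Computes C = A * B for dense row-major float32 matrices:
//   A is m x k, B is k x n, C is m x n.
function matMulReference(
  a: Float32Array,
  b: Float32Array,
  m: number,
  k: number,
  n: number,
): Float32Array {
  if (a.length !== m * k || b.length !== k * n) {
    throw new Error("matrix dimensions do not match buffer lengths");
  }
  const c = new Float32Array(m * n);
  for (let row = 0; row < m; row++) {
    for (let col = 0; col < n; col++) {
      // Each (row, col) output element is independent of the others,
      // which is why this loop nest is a natural fit for GPU offload.
      let sum = 0;
      for (let i = 0; i < k; i++) {
        sum += a[row * k + i] * b[i * n + col];
      }
      c[row * n + col] = sum;
    }
  }
  return c;
}

// Usage: multiply two 2x2 matrices.
const result = matMulReference(
  new Float32Array([1, 2, 3, 4]),
  new Float32Array([5, 6, 7, 8]),
  2, 2, 2,
);
console.log(Array.from(result)); // [19, 22, 43, 50]
```

A reference like this could also serve as a correctness check for whatever a GPU kernel produces, within floating-point tolerance.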