Great Repository!
Is it within your scope to implement a WebGPU-accelerated version of Whisper?
Not sure if this helps, but there is a C port of Whisper with a CPU implementation, and as mentioned in this discussion, the main thing that needs to be offloaded to the GPU is the GGML_OP_MUL_MAT operator.
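For context, that operator is essentially a matrix product. Below is a minimal Python sketch of the computation, assuming plain row-major float matrices. The name matmul_reference is made up for illustration, and it ignores ggml's actual tensor layout and quantized types; something like it could serve as a CPU reference for checking a GPU kernel's output.

```python
def matmul_reference(a, b):
    """Naive matrix product: (m x k) times (k x n) gives (m x n).

    Hypothetical helper for illustration only; it does not mirror
    ggml's memory layout or its quantized formats.
    """
    m, k = len(a), len(a[0])
    if len(b) != k:
        raise ValueError("inner dimensions must match")
    n = len(b[0])
    out = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            # Each (i, j) output is an independent dot product, which is
            # why this operation maps well onto one GPU thread per element.
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            out[i][j] = acc
    return out
```

Since every output element is computed independently, a WebGPU compute shader could in principle assign one invocation per (i, j) pair.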
Thanks