Replies: 2 comments
-
Basically the equivalent of Grad-CAM for audio with Whisper?
-
Any updates on this?
-
Does anyone know how to visualize the encoder attention maps with respect to the input spectrogram?
I'm interested in understanding which portions of the spectrogram a fine-tuned whisper-base model focuses on when making a prediction.
I can extract the attention maps in the forward pass (each is 1500x1500), but I don't know how to map them back to the input spectrogram.
Any ideas?
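
One possible starting point, offered as a rough sketch and not a confirmed answer: the 1500x1500 shape suggests the attention is over time positions only. Assuming the standard Whisper front end (a 30 s log-mel input of 80 x 3000 frames, downsampled by a stride-2 convolution to 1500 encoder positions), each encoder position would cover roughly two spectrogram frames. Under that assumption, you could average the attention each position receives and repeat it twice along the time axis. The function names below (`attention_to_frame_saliency`, `saliency_to_overlay`) are hypothetical helpers, not part of any Whisper API, and the demo uses random data in place of a real attention map.

```python
import numpy as np


def attention_to_frame_saliency(attn, n_frames=3000):
    """Collapse a square (query x key) attention map to one value per spectrogram frame.

    Assumption: encoder positions map to frames by a fixed integer factor
    (n_frames // n_positions), e.g. 3000 // 1500 = 2 for Whisper.
    """
    attn = np.asarray(attn, dtype=np.float64)
    if attn.ndim != 2 or attn.shape[0] != attn.shape[1]:
        raise ValueError("expected a square (positions x positions) attention map")
    n_pos = attn.shape[0]
    if n_frames % n_pos != 0:
        raise ValueError("n_frames must be an integer multiple of the number of positions")

    # Average over queries: how much attention each key position receives.
    received = attn.mean(axis=0)

    # Nearest-neighbour upsampling along time: each position covers `factor` frames.
    factor = n_frames // n_pos
    saliency = np.repeat(received, factor)

    # Min-max normalise to [0, 1] so it can be used as an overlay alpha.
    span = saliency.max() - saliency.min()
    if span > 0:
        return (saliency - saliency.min()) / span
    return np.zeros_like(saliency)


def saliency_to_overlay(saliency, n_mels=80):
    """Broadcast per-frame saliency across mel bins to match the spectrogram shape."""
    return np.tile(saliency, (n_mels, 1))


if __name__ == "__main__":
    # Stand-in for one head of one layer; replace with a real extracted map.
    rng = np.random.default_rng(0)
    logits = rng.normal(size=(1500, 1500))
    attn = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)  # row-wise softmax

    overlay = saliency_to_overlay(attention_to_frame_saliency(attn))
    print(overlay.shape)
```

The resulting 80 x 3000 array can be drawn over the spectrogram with a transparent colormap. Two caveats: this only gives time resolution (every mel bin in a frame gets the same weight), and in deeper layers each position already mixes information from many others, so the map is at best approximate. For frequency-resolved attribution, a gradient-based method in the spirit of Grad-CAM on the input spectrogram is probably a better fit.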