[Modeling] Fix encoder CPU offloading for whisper #38994
vasqu merged 6 commits into huggingface:main
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
zucchini-nlp left a comment:
Thanks, I like that we removed direct access to weight.data. Can you also un-skip offload tests in whisper and make sure they are green?
For ex:
transformers/tests/models/whisper/test_modeling_whisper.py
Lines 3357 to 3368 in 21cb353
@zucchini-nlp @SunMarc Tests unskipped and passing!
run-slow: whisper
This comment contains run-slow, running the specified jobs: models: ['models/whisper']
@zucchini-nlp Does this test failure indicate something to fix, or is this test noisy?
vasqu left a comment:
@kylesayrs Can you rebase/merge? The failing tests are expected, no worries :D
@vasqu Merged, glad to hear it :)
Thanks @kylesayrs 🤗
* fix cpu offloading for whisper
  Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
* unskip offloading tests
  Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
* revert small change
  Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
* remove tests
  Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
---------
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Purpose
Without this change, attempting to CPU offload the encoder layer raises a device error.
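A minimal sketch of the failure mode, assuming a toy stand-in for the offload hooks. `attach_toy_offload_hooks` is hypothetical and is not the accelerate implementation. It only mimics the behavior that matters here: outside of `forward()` the module exposes a placeholder weight on the `meta` device, and the real values are onloaded only when the module is called. The `torch.arange` call over all positions is an assumption about how the module-call approach looks in practice, not a quote from the diff.

```python
import torch
from torch import nn


def attach_toy_offload_hooks(module: nn.Embedding) -> None:
    """Hypothetical stand-in for an offload hook (not the accelerate API).

    The real weight is kept aside; the module only holds it during forward().
    """
    stored = module.weight.detach().clone()  # the "offloaded" copy

    def onload(mod, args):
        # Pre-forward: materialize the real weight on the module.
        mod.weight = nn.Parameter(stored.clone(), requires_grad=False)

    def offload(mod, args, output):
        # Post-forward: swap back to a data-less placeholder on the meta device.
        placeholder = torch.empty_like(stored, device="meta")
        mod.weight = nn.Parameter(placeholder, requires_grad=False)

    module.register_forward_pre_hook(onload)
    module.register_forward_hook(offload)
    offload(module, None, None)  # start in the offloaded state


embed_positions = nn.Embedding(8, 4)
expected = embed_positions.weight.detach().clone()
attach_toy_offload_hooks(embed_positions)

inputs_embeds = torch.zeros(1, 8, 4)

# Direct attribute access bypasses the hooks and sees only the placeholder.
direct = embed_positions.weight

# Calling the module triggers the hooks; indexing every position returns
# the full positional table on the same device as the inputs.
all_positions = torch.arange(
    embed_positions.num_embeddings, device=inputs_embeds.device
)
hidden_states = inputs_embeds + embed_positions(all_positions)
```

Going through the module call keeps device placement in the hands of whatever hooks are attached, so the same code path works whether or not the layer is offloaded.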
Changes
* Instead of accessing the embed_positions.weight attribute directly, leverage the hf hooks attached to the embed_positions module to onload the weight properly.
* F.embedding must be called with an identity matrix, rather than grabbing the weight value directly.
Testing
Use the following test script to verify that generation works with the device map:
test_whisper_offload.py
Potential Reviewers