
Conversation

@jackzhxng jackzhxng commented Oct 16, 2025

Fixes Whisper export, which broke a while back due to transformers changes, plus miscellaneous seq2seq-related cleanups.

Runs on both the portable and XNNPACK backends.

Stack from (oldest at bottom):

@jackzhxng
Collaborator Author

Note: ignore the CI test failures; they are unrelated. If you click into the logs, you can see that tests pass until the runner runs out of disk space. I've also verified these tests locally.

@jackzhxng jackzhxng changed the title Fix Whisper Fix Whisper and T5 Oct 17, 2025
@jackzhxng jackzhxng changed the title Fix Whisper and T5 Fix Whisper Oct 17, 2025
@jackzhxng jackzhxng merged commit 4676609 into huggingface:main Oct 17, 2025
56 of 83 checks passed
@jackzhxng jackzhxng mentioned this pull request Oct 17, 2025
@dylanschoenmakers

With this change, ASR models are now exported as a single model.pte with encoder and decoder methods instead of a forward, correct? Is there a way to still export them as separate encoder/decoder models so that it works with the current version of react-native-executorch?

@jackzhxng
Collaborator Author

@dylanschoenmakers yes, that's correct. The reason is that we generally recommend the one-model / multiple-method approach, since it carries less overhead.

cc @chmjkb, is this something you'd be able to update for react-native-executorch?
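For downstream consumers, the calling-convention change is the main thing to adapt to: instead of loading two files and calling `forward` on each, one artifact now exposes named `encoder`/`decoder` methods. Here is a plain-Python mock of that shape; this is not the ExecuTorch runtime API, and all names below are illustrative.

```python
# Illustrative mock of the one-file / multiple-method layout.
# NOT the ExecuTorch API -- just a sketch of the calling-convention change
# from "two artifacts, each with a forward" to "one artifact, named methods".

class MultiMethodProgram:
    """Stands in for a single model.pte exposing named methods."""

    def __init__(self, methods):
        self._methods = methods  # method name -> callable

    def execute(self, method_name, *args):
        return self._methods[method_name](*args)


# Hypothetical stand-ins for the exported encoder/decoder graphs.
def encoder(audio_features):
    return [x * 2 for x in audio_features]   # pretend hidden states

def decoder(hidden_states, token):
    return sum(hidden_states) + token        # pretend next-token logit


program = MultiMethodProgram({"encoder": encoder, "decoder": decoder})

# Callers now dispatch by method name on one program object.
hidden = program.execute("encoder", [1, 2, 3])
logit = program.execute("decoder", hidden, 5)
```

The point of the sketch is only the dispatch shape: one loaded program, with the encoder and decoder selected by name rather than by which file was loaded.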

chmjkb commented Oct 20, 2025

@jackzhxng, sure we'll update it as soon as we can, thanks for fixing this! Any ideas on what the perf improvements should be?

@jackzhxng
Collaborator Author

cc @larryliu0820

@larryliu0820
Collaborator

> @jackzhxng, sure we'll update it as soon as we can, thanks for fixing this! Any ideas on what the perf improvements should be?

I'm getting 3.4 tok/s on whisper-small (not tiny)

@larryliu0820
Collaborator

> With this change ASR models are now exported as a single model.pte with encoder and decoder methods instead of a forward correct? Is there a way to still export them as separate encoder/decoder models so that it works with the current version of react-native-executorch?

I think having both methods in one file ensures that they can work together. We also recognize that it makes the model artifact much bigger, but I believe using program-data separation can mitigate this issue.
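For context on that last point: the idea behind program-data separation is to keep the serialized program (graph structure) apart from the weight data, so that two methods can reference one shared copy of the parameters instead of duplicating them. A rough plain-Python analogy follows; this is a toy JSON/binary format, not ExecuTorch's real .pte/.ptd formats, and all names are illustrative.

```python
import json
import os
import struct
import tempfile

# Toy analogy for program/data separation: the "program" file stores only
# metadata plus (offset, count) entries into a shared "data" file holding
# the raw weights, so multiple methods can share one copy of the parameters.

def save_separated(weights, program_meta, data_path, program_path):
    offsets = {}
    with open(data_path, "wb") as f:
        for name, values in weights.items():
            offsets[name] = (f.tell(), len(values))
            f.write(struct.pack(f"{len(values)}f", *values))
    with open(program_path, "w") as f:
        json.dump({"meta": program_meta, "offsets": offsets}, f)

def load_weight(data_path, program_path, name):
    with open(program_path) as f:
        offset, count = json.load(f)["offsets"][name]
    with open(data_path, "rb") as f:
        f.seek(offset)
        return list(struct.unpack(f"{count}f", f.read(4 * count)))
```

A quick round trip shows the split: `save_separated({"w": [1.0, 2.0]}, {"methods": ["encoder", "decoder"]}, "model.ptd", "model.json")` writes the weights once, and either method's metadata can then resolve `"w"` via `load_weight`.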

@larryliu0820
Collaborator

> @jackzhxng, sure we'll update it as soon as we can, thanks for fixing this! Any ideas on what the perf improvements should be?
>
> I'm getting 3.4 tok/s on whisper-small (not tiny)

Oh, if you are curious about the perf difference between separate files and one file: I don't think there will be a significant perf improvement, just the benefit I commented on earlier.
