Qualcomm AI Engine Direct - GA Whisper #12102

shewu-quic · 2025-06-30T04:27:00Z

Summary:

- Add the unit test for Whisper
- Support multi-method for Whisper
- Add qnn_whisper_runner to run whisper encoder-decoder model
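
As a rough illustration of what an encoder-decoder runner does, the sketch below runs the encoder once and then calls the decoder step by step with greedy token selection. The names `greedy_decode`, `encode`, `decode_step`, and the toy stubs are hypothetical and are not the `qnn_whisper_runner` API; in a multi-method program the two callables would presumably map to separately exported encoder and decoder methods.

```python
from typing import Callable, List, Sequence


def greedy_decode(
    encode: Callable[[Sequence[float]], List[float]],
    decode_step: Callable[[List[float], List[int]], List[float]],
    audio_features: Sequence[float],
    start_token: int,
    eos_token: int,
    max_seq_len: int,
) -> List[int]:
    """Run the encoder once, then decode one token at a time (greedy)."""
    encoder_out = encode(audio_features)  # encoder runs once per utterance
    tokens = [start_token]
    while len(tokens) < max_seq_len:
        logits = decode_step(encoder_out, tokens)  # one decoder call per token
        next_token = max(range(len(logits)), key=logits.__getitem__)
        tokens.append(next_token)
        if next_token == eos_token:
            break
    return tokens


# Toy stand-ins (hypothetical): vocabulary of 4 tokens, token 3 is EOS.
def toy_encode(features: Sequence[float]) -> List[float]:
    return [sum(features)]


def toy_decode_step(encoder_out: List[float], tokens: List[int]) -> List[float]:
    # Emit token 1, then token 2, then EOS, regardless of the audio.
    script = {1: 1, 2: 2}
    target = script.get(len(tokens), 3)
    return [1.0 if i == target else 0.0 for i in range(4)]


result = greedy_decode(toy_encode, toy_decode_step, [0.1, 0.2], 0, 3, 1024)
```

The loop stops either on the end-of-sequence token or when the token list reaches `max_seq_len`, which is the role the `--max_seq_len` flag appears to play in the command.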

Command:

python examples/qualcomm/oss_scripts/whisper/whisper.py -b build-android -s <serial> -H <host> -m SM8750 --max_seq_len 1024

Performance:

SM8750:
- avg encoding time: 0.037 s
- avg decoding time: 0.004 s

Accuracy:

Word Error Rate: 0.1941964328289032
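
For context, word error rate is conventionally the word-level edit distance (substitutions, deletions, insertions) between the hypothesis and the reference, divided by the number of reference words. The PR does not say how this figure was computed or on which dataset, so the function below is only a minimal sketch of the standard definition; `word_error_rate` is a name chosen for illustration.

```python
from typing import List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref: List[str] = reference.split()
    hyp: List[str] = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference prefix so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the"): 2 edits / 6 words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Under this definition, the reported value of roughly 0.194 would correspond to about one word-level edit for every five reference words.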

cc: @haowhsu-quic, @cccclai, @winskuo-quic

Summary:

- Fix the index_put bug
- Add the unit test for Whisper
- Support multi-method for Whisper
- Add qnn_whisper_runner to run whisper encoder-decoder model

pytorch-bot · 2025-06-30T04:27:03Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/12102

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit ff09c9a with merge base bf9cd34:

NEW FAILURE - The following job has failed:

pull / unittest-editable / macos / macos-job (gh)
backends/xnnpack/test/ops/test_gelu.py::TestGelu::test_fp16_gelu

This comment was automatically generated by Dr. CI and updates every 15 minutes.

shewu-quic · 2025-06-30T05:34:43Z

@pytorchbot label "release notes: qualcomm"

facebook-github-bot · 2025-07-01T02:49:46Z

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

cccclai

This is great! Curious if whisper needs special handling for better perf?

shewu-quic · 2025-07-04T02:32:02Z

> This is great! Curious if whisper needs special handling for better perf?

Currently, I’m applying these optimizations:

- Converting linear layers to convolutional
- Tagging quant IO to avoid Q/DQ for input/output
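
The first item relies on the fact that a linear layer is numerically equivalent to a 1x1 convolution applied at every position of a feature map whose channels are the input features. The sketch below checks that equivalence in plain Python on a tiny example; the helpers `linear` and `conv1x1` are written for illustration only and are not the pass used in the Qualcomm backend.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def linear(x: Vector, weight: Matrix, bias: Vector) -> Vector:
    """y = W x + b for one token; weight is [out_features][in_features]."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def conv1x1(feature_map: List[Matrix], weight: Matrix, bias: Vector) -> List[Matrix]:
    """1x1 convolution over a [channels][height][width] feature map."""
    in_ch = len(feature_map)
    height, width = len(feature_map[0]), len(feature_map[0][0])
    out = [[[0.0] * width for _ in range(height)] for _ in weight]
    for o, (row, b) in enumerate(zip(weight, bias)):
        for h in range(height):
            for w in range(width):
                out[o][h][w] = sum(row[c] * feature_map[c][h][w]
                                   for c in range(in_ch)) + b
    return out


# Two tokens with three features each.
tokens = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
weight = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]
bias = [0.0, 1.0]

linear_out = [linear(t, weight, bias) for t in tokens]

# Lay the same data out as channels x 1 x tokens, i.e. shape [3][1][2].
feature_map = [[[t[c] for t in tokens]] for c in range(3)]
conv_out = conv1x1(feature_map, weight, bias)

# Read the conv result back per token to compare with the linear output.
conv_as_tokens = [[conv_out[o][0][i] for o in range(2)] for i in range(2)]
```

Both paths produce the same values, so the rewrite changes only how the operation is expressed to the backend, not the result.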

I believe there’s still room for further performance improvements, such as mha2sha and custom annotation for kv cache.

@haowhsu-quic


Qualcomm AI Engine Direct - GA Whisper

ff09c9a


shewu-quic requested review from cccclai, kirklandsign and larryliu0820 as code owners June 30, 2025 04:27

facebook-github-bot added the CLA Signed label Jun 30, 2025

pytorch-bot added the release notes: qualcomm label Jun 30, 2025

cccclai approved these changes Jul 4, 2025


cccclai merged commit fd677ac into pytorch:main Jul 4, 2025
103 of 106 checks passed
