Add quantization and partitioner flow in the qualcomm doc #12387

cccclai · 2025-07-11T04:54:06Z

Summary: Add a section describing how to lower a model to HTP, including the quantization step.

Differential Revision: D78117959

pytorch-bot · 2025-07-11T04:54:10Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/12387

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 2 Unrelated Failures

As of commit 8485699 with merge base 7c70403:

NEW FAILURE - The following job has failed:

pull / test-eval_llama-mmlu-linux / linux-job (gh)
RuntimeError: Command docker exec -t 7b24db41dc47ec52f9a6862301d6331b95be3e5c5bee18ecfb5c60e423080264 /exec failed with exit code 1

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

pull / unittest / linux / linux-job (gh) (trunk failure)
examples/models/llama/tests/test_ring_attention.py::TestRingAttention::test_sliding_window_attention
pull / unittest-editable / linux / linux-job (gh) (trunk failure)
examples/models/llama/tests/test_ring_attention.py::TestRingAttention::test_sliding_window_attention

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2025-07-11T04:54:13Z

This pull request was exported from Phabricator. Differential Revision: D78117959

github-actions · 2025-07-11T04:54:45Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

docs/source/backends-qualcomm.md

shewu-quic

Thank you for the effort to make the documentation better!

examples/qualcomm/scripts/export_example.py

facebook-github-bot · 2025-07-11T16:57:37Z

This pull request was exported from Phabricator. Differential Revision: D78117959

facebook-github-bot · 2025-07-11T17:04:08Z

This pull request was exported from Phabricator. Differential Revision: D78117959

cccclai · 2025-07-11T17:14:36Z

Hi @metascroy, this PR is for Qualcomm doc and it includes quantization.

facebook-github-bot · 2025-07-11T18:55:43Z

This pull request was exported from Phabricator. Differential Revision: D78117959

facebook-github-bot · 2025-07-11T22:19:14Z

This pull request was exported from Phabricator. Differential Revision: D78117959

metascroy · 2025-07-14T17:22:26Z

docs/source/backends-qualcomm.md

+#### Step 2: [Optional] Quantize Your Model
+Choose between quantization approaches, post training quantization (PTQ) or quantization aware training (QAT):
+```python
+from executorch.backends.qualcomm.quantizer.quantizer import QnnQuantizer


Is QnnQuantizer configurable? If so, can we document the configuration?
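
For context on the quantization step this hunk documents, here is a minimal, standard-library-only sketch of what post-training quantization computes in general: calibrate a scale and zero point from observed values, then map floats to 8-bit integers and back. It is an illustration under stated assumptions, not `QnnQuantizer`'s implementation or API. The names `QParams`, `calibrate`, `quantize`, and `dequantize` are hypothetical and do not come from ExecuTorch, and the unsigned 8-bit range is an assumed default.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class QParams:
    """Affine quantization parameters (hypothetical helper type)."""
    scale: float
    zero_point: int


def calibrate(samples: Sequence[float], qmin: int = 0, qmax: int = 255) -> QParams:
    """Derive scale and zero point from the observed min/max of calibration data."""
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(samples), 0.0)
    hi = max(max(samples), 0.0)
    # Fall back to a scale of 1.0 when every sample is zero.
    scale = (hi - lo) / (qmax - qmin) or 1.0
    zero_point = int(round(qmin - lo / scale))
    return QParams(scale, max(qmin, min(qmax, zero_point)))


def quantize(x: float, p: QParams, qmin: int = 0, qmax: int = 255) -> int:
    """Map a float to the integer grid, clamping to [qmin, qmax]."""
    q = round(x / p.scale) + p.zero_point
    return max(qmin, min(qmax, q))


def dequantize(q: int, p: QParams) -> float:
    """Map an integer back to its approximate float value."""
    return (q - p.zero_point) * p.scale


if __name__ == "__main__":
    calibration_data = [-1.0, 0.0, 0.5, 2.0]
    params = calibrate(calibration_data)
    for value in calibration_data:
        q = quantize(value, params)
        print(value, q, dequantize(q, params))
```

Whatever configuration a real quantizer exposes (bit width, per-channel versus per-tensor, observer choice) would change how `scale` and `zero_point` are derived, which is why documenting the configuration matters.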

facebook-github-bot · 2025-07-15T20:57:15Z

This pull request was exported from Phabricator. Differential Revision: D78117959

facebook-github-bot · 2025-07-15T21:07:32Z

This pull request was exported from Phabricator. Differential Revision: D78117959

cccclai requested a review from mergennachin as a code owner July 11, 2025 04:54

facebook-github-bot added the CLA Signed label Jul 11, 2025

facebook-github-bot added the fb-exported label Jul 11, 2025

cccclai requested review from DannyYuyang-quic, haowhsu-quic, shewu-quic and winskuo-quic July 11, 2025 04:57

cccclai mentioned this pull request Jul 11, 2025

Add quantization documentation to the Qualcomm docs #12259

Closed

DannyYuyang-quic reviewed Jul 11, 2025

docs/source/backends-qualcomm.md (outdated, resolved)

shewu-quic approved these changes Jul 11, 2025

examples/qualcomm/scripts/export_example.py (outdated, resolved)

cccclai force-pushed the export-D78117959 branch from 33ad9d8 to 2384c7d Compare July 11, 2025 16:57

cccclai force-pushed the export-D78117959 branch from 2384c7d to aafb8af Compare July 11, 2025 17:03

cccclai force-pushed the export-D78117959 branch from aafb8af to e94a9e1 Compare July 11, 2025 18:50

cccclai force-pushed the export-D78117959 branch from e94a9e1 to 5a3a437 Compare July 11, 2025 18:55

cccclai force-pushed the export-D78117959 branch from 5a3a437 to 111d159 Compare July 11, 2025 22:14

cccclai force-pushed the export-D78117959 branch from 111d159 to 1263030 Compare July 11, 2025 22:19

metascroy reviewed Jul 14, 2025

cccclai force-pushed the export-D78117959 branch from 1263030 to 55f7f15 Compare July 15, 2025 20:52

cccclai force-pushed the export-D78117959 branch from 55f7f15 to 3045205 Compare July 15, 2025 20:57

cccclai force-pushed the export-D78117959 branch from 3045205 to de527a2 Compare July 15, 2025 21:02

Add quantization and partitioner flow in the qualcomm doc (pytorch#12387) · 8485699

Summary: Pull Request resolved: pytorch#12387. Add a section describing how to lower a model to HTP, including the quantization step. Differential Revision: D78117959

cccclai force-pushed the export-D78117959 branch from de527a2 to 8485699 Compare July 15, 2025 21:07

metascroy approved these changes Jul 16, 2025

SS-JIA merged commit af07feb into pytorch:main Jul 17, 2025
99 of 103 checks passed

lucylq pushed a commit that referenced this pull request Jul 17, 2025

Add quantization and partitioner flow in the qualcomm doc (#12387)

2c21096

Summary: Add a section describing how to lower a model to HTP, including the quantization step. Differential Revision: D78117959
