Qualcomm AI Engine Direct - qat proto #6222

chunit-quic · 2024-10-15T06:42:57Z

- Add qat proto
- Add Unit test test_qnn_backend_linear_qat
- Test command

python backends/qualcomm/tests/test_qnn_delegate.py -H $HOST -s $DEVICE -b $build-android/ -m "SM8650" -r $EXECUTORCH_ROOT -k TestQNNQuantizedOperator.test_qnn_backend_linear_qat

pytorch-bot · 2024-10-15T06:43:00Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/6222

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures

As of commit dd8ace6 with merge base ad0e5e8:

NEW FAILURES - The following jobs have failed:

pull / test-eval_llama-mmlu-linux / linux-job (gh)
RuntimeError: Command docker exec -t 79b43d0cb5043918c582b7e464ad58c6ddae5c9748fe89d20875ea9f5b969ffe /exec failed with exit code 127
pull / test-eval_llama-wikitext-linux / linux-job (gh)
RuntimeError: Command docker exec -t 17acd430af738d9587439a5a38c3d482a13a44b9e2825866660778b2cfd59d4a /exec failed with exit code 127
pull / test-llama_runner_eager-linux / linux-job (gh)
RuntimeError: Command docker exec -t c288be6a4c728e12946161713e45f485b7c1def8a43f1fba37b6feecb56b99d3 /exec failed with exit code 127

This comment was automatically generated by Dr. CI and updates every 15 minutes.

navsud

Would it be possible to also add 16a4w qat config as well, in the same PR or as a follow-up?

navsud · 2024-10-15T16:10:17Z

backends/qualcomm/quantizer/utils.py

+        quant_max=torch.iinfo(torch.int8).max,
+        qscheme=torch.per_tensor_symmetric,
+        ch_axis=0,
+        observer_or_fake_quant_ctr=FusedMovingAvgObsFakeQuantize.with_args(observer=MovingAverageMinMaxObserver),


How about using MovingAveragePerChannelMinMaxObserver instead of MovingAverageMinMaxObserver, since we are going to do per-channel quantization for the weights?

FusedMovingAvgObsFakeQuantize.with_args(observer=MovingAveragePerChannelMinMaxObserver)
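To illustrate why the observer choice matters for per-channel weights, here is a minimal pure-Python sketch. It is not the PyTorch observer implementation: it ignores the moving-average update and only compares one shared symmetric int8 scale against one scale per output channel (`ch_axis=0`). All function names and the example weights are hypothetical.

```python
# Illustrative only: compares per-tensor and per-channel symmetric int8 scales.
# Function names and weight values are assumptions, not ExecuTorch/PyTorch APIs.
INT8_MAX = 127  # same value as torch.iinfo(torch.int8).max


def per_tensor_scale(weight):
    """One symmetric scale shared by every output channel (row)."""
    max_abs = max(abs(v) for row in weight for v in row)
    return max_abs / INT8_MAX


def per_channel_scales(weight):
    """One symmetric scale per output channel, i.e. per row (ch_axis=0)."""
    return [max(abs(v) for v in row) / INT8_MAX for row in weight]


def fake_quantize(value, scale):
    """Round to the int8 grid, clamp, and map back to float."""
    q = round(value / scale)
    q = max(-INT8_MAX, min(INT8_MAX, q))
    return q * scale


def row_error(row, scale):
    """Worst-case absolute round-trip error for one channel."""
    return max(abs(v - fake_quantize(v, scale)) for v in row)


weight = [
    [0.010, -0.020, 0.015],  # small-magnitude channel
    [1.500, -2.000, 0.750],  # large-magnitude channel
]

shared = per_tensor_scale(weight)
per_row = per_channel_scales(weight)

# The small channel is quantized far more coarsely with the shared scale.
err_small_shared = row_error(weight[0], shared)
err_small_per_channel = row_error(weight[0], per_row[0])
print(f"shared scale:      {shared:.6f}, error {err_small_shared:.6f}")
print(f"per-channel scale: {per_row[0]:.6f}, error {err_small_per_channel:.6f}")
```

With a per-tensor observer, the large-magnitude channel dictates the range for every row, so the small channel loses most of its resolution; a per-channel observer tracks each row's range separately.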

Thanks for the advice! Sorry for the late reply; I was OoO.
If possible, may we keep this as is for simplicity for now? We are currently working on different kinds of quant configs for the existing test cases. Once we finish, we will update the quantizer with QAT.

chunit-quic · 2024-10-16T08:29:43Z

> Would it be possible to also add 16a4w qat config as well, in the same PR or as a follow-up?

Hi @navsud,

Thank you very much for the prompt reply!
Please note that this PR is a draft meant to confirm the QAT flow, for example whether the unit test matches your expectation, because we have little prior experience with QAT.

If it looks correct to you, that would be great, and we will then move on to covering more quant configs.
One more question: could you kindly give us some hints? For example, is there a model suitable for testing the effect of QAT, perhaps one that can be verified quickly?

Thank you!

Joey

navsud · 2024-10-16T16:42:06Z

The QAT unit test matches the intent.
Please update the observer to use per-channel, and after that we can go ahead with this PR. Overall, looks good to me.

For testing QAT on a real model, the easiest test is to apply QAT using the same dataset you use for PTQ calibration (e.g. Wiki) and the expectation is that loss should go down and the model metrics should be better than PTQ.

Update: You can also test the QAT flow on any open-source imagenet models (e.g. mobilenet-v2).
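The check described above (QAT loss should end up below the PTQ loss on the same data) can be sketched on a toy problem. This is a hedged illustration in plain Python, not the ExecuTorch or QNN flow: the two-weight linear model, the coarse fixed scale, the data, and every name below are assumptions made for the example. Gradients pass through the rounding step with a straight-through estimator, and the best loss seen is tracked because latent weights near a rounding threshold can oscillate.

```python
# Toy PTQ-vs-QAT comparison. Everything here is hypothetical example code.
SCALE = 0.25  # deliberately coarse grid so rounding error is visible


def fake_quant(w):
    """Snap a weight to the nearest multiple of SCALE."""
    return round(w / SCALE) * SCALE


def predict(weights, x):
    return sum(w * xi for w, xi in zip(weights, x))


def mse(weights, inputs, targets):
    total = sum((predict(weights, x) - t) ** 2 for x, t in zip(inputs, targets))
    return total / len(inputs)


float_weights = [0.60, 0.30]
inputs = [(1.0, 1.0), (-1.0, -1.0)]  # stand-in for the calibration set
targets = [predict(float_weights, x) for x in inputs]  # float model outputs

# PTQ baseline: round each trained weight independently.
ptq_loss = mse([fake_quant(w) for w in float_weights], inputs, targets)

# QAT: forward pass uses fake-quantized weights; the gradient is applied to
# the latent float weights as if rounding were the identity (STE).
latent = list(float_weights)
best_qat_loss = ptq_loss
learning_rate = 0.05
for _ in range(20):
    quantized = [fake_quant(w) for w in latent]
    grads = [0.0] * len(latent)
    for x, t in zip(inputs, targets):
        err = predict(quantized, x) - t
        for j, xj in enumerate(x):
            grads[j] += 2.0 * err * xj / len(inputs)
    latent = [w - learning_rate * g for w, g in zip(latent, grads)]
    step_loss = mse([fake_quant(w) for w in latent], inputs, targets)
    best_qat_loss = min(best_qat_loss, step_loss)

print(f"PTQ loss: {ptq_loss:.4f}, best QAT loss: {best_qat_loss:.4f}")
```

On a real model the same comparison would use the actual PTQ calibration dataset and task metric instead of this toy loss.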

facebook-github-bot · 2024-10-16T22:45:22Z

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

cccclai

Hi @chunit-quic, should we merge this into main and iterate on this with a real model? It will help with running QAT internally as well.

cccclai · 2024-10-17T21:57:04Z

We just need to fix the lint error and it should be good.

chunit-quic · 2024-10-21T01:46:49Z

> Hi @chunit-quic, should we merge this into main and iterate on this with a real model? It will help with running QAT internally as well.

Sorry for the late reply; I was previously OoO.
Yes, we can merge the PR if it helps the internal iteration. I have fixed the lint errors.
We will provide a formal version soon that includes more quantization settings.

facebook-github-bot · 2024-10-21T03:45:33Z

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot added the CLA Signed label Oct 15, 2024

chiwwang mentioned this pull request Oct 15, 2024

Support QAT in QCOM qnn backend #6212

Closed

navsud reviewed Oct 15, 2024

cccclai approved these changes Oct 17, 2024

Joey Tsai added 2 commits October 21, 2024 10:20

[qat proto]

71740bb

- Add qat proto
- Add Unit test test_qnn_backend_linear_qat
- Test command
```bash
python backends/qualcomm/tests/test_qnn_delegate.py -H $HOST -s $DEVICE -b $build-android/ -m "SM8650" -r $EXECUTORCH_ROOT -k TestQNNQuantizedOperator.test_qnn_backend_linear_qat
```

[Fix lint]

dd8ace6

chunit-quic force-pushed the dev1/chunit/qat_proto branch from 9bec383 to dd8ace6 Compare October 21, 2024 02:36

cccclai mentioned this pull request Oct 21, 2024

Qualcomm AI Engine Direct - oss model enablement (retinanet_fpn) #6321

Merged

kirklandsign marked this pull request as ready for review October 21, 2024 23:29

kirklandsign merged commit 1247545 into pytorch:main Oct 21, 2024
42 of 46 checks passed
