Conversation

@chunit-quic
Contributor

  • Add qat proto
  • Add Unit test test_qnn_backend_linear_qat
  • Test command
python backends/qualcomm/tests/test_qnn_delegate.py -H $HOST -s $DEVICE -b $build-android/ -m "SM8650" -r $EXECUTORCH_ROOT -k TestQNNQuantizedOperator.test_qnn_backend_linear_qat

@pytorch-bot

pytorch-bot bot commented Oct 15, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/6222

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures

As of commit dd8ace6 with merge base ad0e5e8:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Oct 15, 2024
Contributor

@navsud navsud left a comment

Would it be possible to also add a 16a4w QAT config, in the same PR or as a follow-up?

```python
quant_max=torch.iinfo(torch.int8).max,
qscheme=torch.per_tensor_symmetric,
ch_axis=0,
observer_or_fake_quant_ctr=FusedMovingAvgObsFakeQuantize.with_args(observer=MovingAverageMinMaxObserver),
```
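For context, a self-contained sketch of a symmetric int8 weight fake-quant spec along these lines might look as follows. This is a minimal sketch assuming current `torch.ao.quantization` names; `quant_min` and the surrounding wiring are assumptions here, not the PR's exact config:

```python
import torch
from torch.ao.quantization import (
    FusedMovingAvgObsFakeQuantize,
    MovingAverageMinMaxObserver,
)

# Hypothetical standalone version of the per-tensor weight config above.
# quant_min is an assumption: a symmetric int8 scheme typically uses -127.
weight_fake_quant = FusedMovingAvgObsFakeQuantize.with_args(
    observer=MovingAverageMinMaxObserver,
    quant_min=torch.iinfo(torch.int8).min + 1,  # -127
    quant_max=torch.iinfo(torch.int8).max,      # 127
    dtype=torch.qint8,
    qscheme=torch.per_tensor_symmetric,
)()

w = torch.randn(4, 8)
out = weight_fake_quant(w)  # observes ranges and fake-quantizes in one fused op
```

The fused module keeps the observer statistics update and the fake-quantize step in a single kernel, which is why it is the usual choice for QAT training loops.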
Contributor


How about using MovingAveragePerChannelMinMaxObserver instead of MovingAverageMinMaxObserver, since we are going to do per-channel quantization for the weights?

```python
FusedMovingAvgObsFakeQuantize.with_args(observer=MovingAveragePerChannelMinMaxObserver)
```
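As a sanity check, a hedged sketch of the suggested per-channel variant (assuming current `torch.ao.quantization` names and a symmetric int8 range; not the PR's exact config):

```python
import torch
from torch.ao.quantization import (
    FusedMovingAvgObsFakeQuantize,
    MovingAveragePerChannelMinMaxObserver,
)

# Per-channel symmetric int8 weight fake-quant: one (scale, zero_point)
# pair per output channel (ch_axis=0), as suggested above.
per_channel_fq = FusedMovingAvgObsFakeQuantize.with_args(
    observer=MovingAveragePerChannelMinMaxObserver,
    quant_min=torch.iinfo(torch.int8).min + 1,  # -127, assumed symmetric range
    quant_max=torch.iinfo(torch.int8).max,      # 127
    dtype=torch.qint8,
    qscheme=torch.per_channel_symmetric,
    ch_axis=0,
)()

w = torch.randn(4, 8)  # 4 output channels, 8 inputs
out = per_channel_fq(w)
```

Per-channel observers track a min/max range for each output channel separately, which usually recovers noticeably more weight accuracy than a single per-tensor range.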

Contributor Author


Thanks for the advice! Sorry for the late reply; I was OoO.
If possible, may we keep this as-is for simplicity for now? We are currently working on different kinds of quant configs for the existing test cases. Once we finish, we will update the quantizer with QAT.

@chunit-quic
Contributor Author

chunit-quic commented Oct 16, 2024

> Would it be possible to also add a 16a4w QAT config, in the same PR or as a follow-up?
Hi @navsud,

Thank you very much for the prompt reply!
Pardon us: this PR is a draft meant to confirm the QAT flow, e.g. whether the unit test matches your expectation, because we have barely any experience with QAT.

If it looks correct to you, that would be really nice; we will then move on to covering more quant configs.
One more question: could you kindly give us some hints on which models are suitable for testing the effect of QAT? Perhaps one that can be verified quickly?

Thank you!

Joey

@navsud
Contributor

navsud commented Oct 16, 2024

The QAT unit test matches the intent.
Please update the observer to use per-channel, and after that we can go ahead with this PR. Overall, looks good to me.

For testing QAT on a real model, the easiest test is to apply QAT using the same dataset you use for PTQ calibration (e.g. Wiki) and the expectation is that loss should go down and the model metrics should be better than PTQ.

Update: You can also test the QAT flow on any open-source imagenet models (e.g. mobilenet-v2).
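The "loss should go down" expectation above can be sanity-checked on a toy model. This is a hypothetical eager-mode sketch, not the PR's flow (the PR targets the PT2E quantizer path, and `get_default_qat_qconfig("qnnpack")` is just a stand-in QAT config):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy QAT run: fake-quantize a small linear model, train briefly on
# synthetic data, and check that the loss decreases.
model = nn.Sequential(nn.Linear(8, 1))
model.train()
model.qconfig = torch.ao.quantization.get_default_qat_qconfig("qnnpack")
torch.ao.quantization.prepare_qat(model, inplace=True)

x = torch.randn(64, 8)
y = x.sum(dim=1, keepdim=True)  # a target the model can easily fit
opt = torch.optim.SGD(model.parameters(), lr=0.05)
loss_fn = nn.MSELoss()

initial_loss = loss_fn(model(x), y).item()
for _ in range(50):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
final_loss = loss_fn(model(x), y).item()
# With a working QAT setup the fake-quantized model should still learn,
# so final_loss is expected to end up well below initial_loss.
```

The same shape of check scales up to a real model: run QAT with the PTQ calibration data and compare the resulting metrics against the PTQ baseline.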

@facebook-github-bot
Contributor

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Contributor

@cccclai cccclai left a comment


Hi @chunit-quic, should we merge this into main and iterate on it with a real model? It will help with running QAT internally as well.

@cccclai
Contributor

cccclai commented Oct 17, 2024

We just need to fix the lint error and it should be good.

@chunit-quic
Contributor Author

chunit-quic commented Oct 21, 2024

> Hi @chunit-quic, should we merge this into main and iterate on it with a real model? It will help with running QAT internally as well.

Sorry for the late reply; I was OoO.
Yes, we can merge the PR if it helps the internal iteration. I have fixed the lint errors.
We will provide a formal version soon, which will include more quantization settings.

Joey Tsai added 2 commits October 21, 2024 10:20
- Add qat proto
- Add Unit test test_qnn_backend_linear_qat
- Test command
```bash
python backends/qualcomm/tests/test_qnn_delegate.py -H $HOST -s $DEVICE -b $build-android/ -m "SM8650" -r $EXECUTORCH_ROOT -k TestQNNQuantizedOperator.test_qnn_backend_linear_qat
```
@chunit-quic chunit-quic force-pushed the dev1/chunit/qat_proto branch from 9bec383 to dd8ace6 Compare October 21, 2024 02:36
@facebook-github-bot
Contributor

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@kirklandsign kirklandsign marked this pull request as ready for review October 21, 2024 23:29
@kirklandsign kirklandsign merged commit 1247545 into pytorch:main Oct 21, 2024
42 of 46 checks passed