Fix 8w8a qat qconfig setting activations #13284
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/13284
Note: links to docs will display an error until the docs builds have completed.
❌ 2 New Failures, 5 Unrelated Failures as of commit 9f528a0 with merge base 310a05d.
NEW FAILURES: the following jobs have failed.
FLAKY: the following job failed, but likely due to flakiness present on trunk.
BROKEN TRUNK: the following jobs failed, but were already failing on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D80007226
Summary: The 8-bit activation qconfig should not use reduce_range=True, which limits the range to [0, 127]. This diff fixes that issue. Differential Revision: D80007226
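As background for the fix, the arithmetic behind reduce_range can be sketched without torch: PyTorch observers drop one bit of precision when reduce_range=True (a legacy workaround for fbgemm's 16-bit accumulation), so an unsigned 8-bit activation range shrinks from [0, 255] to [0, 127]. A minimal sketch mirroring that semantics; the helper name quant_range is hypothetical, not a PyTorch API:

```python
def quant_range(bits=8, signed=False, reduce_range=False):
    """Return (quant_min, quant_max) for an integer quantized dtype.

    With reduce_range=True, one bit of precision is dropped, so the
    usable unsigned 8-bit range shrinks from [0, 255] to [0, 127].
    """
    if reduce_range:
        bits -= 1
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1

print(quant_range(reduce_range=True))   # (0, 127) -- the bug: half the range
print(quant_range(reduce_range=False))  # (0, 255) -- full uint8 range
```

This is why the fix matters: with reduce_range=True the quantized activations only ever use half of the available 8-bit codes.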
99b8017 to a48263f (Compare)
What range should it be instead?
Without
cccclai
left a comment
Thanks for the fix!
a48263f to 354c03a (Compare)
Summary: The 8-bit activation qconfig should not use reduce_range=True, which limits the range to [0, 127]. This diff fixes that issue. Reviewed By: cccclai. Differential Revision: D80007226
Summary: Pull Request resolved: pytorch#13284. The 8-bit activation qconfig should not use reduce_range=True, which limits the range to [0, 127]. This diff fixes that issue. Reviewed By: cccclai. Differential Revision: D80007226
354c03a to 9f528a0 (Compare)
Differential Revision: D80007226 Pull Request resolved: pytorch#13284
I found the weight qconfig uses reduce_range too:

```python
weight_fake_quant_ctr = FusedMovingAvgObsFakeQuantize.with_args(
    dtype=torch.int8,
    quant_min=torch.iinfo(torch.int8).min + 1,
    quant_max=torch.iinfo(torch.int8).max,
    qscheme=torch.per_tensor_symmetric,
    reduce_range=True,
    observer=MovingAverageMinMaxObserver,
)
```

I think changing it to False would improve accuracy too.
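To sketch why reduce_range=True also hurts symmetric int8 weights (this is an illustration of observer scale arithmetic, not the PyTorch implementation, and symmetric_scale is a hypothetical helper): the scale is max_abs / quant_max, so halving quant_max doubles the scale, i.e. doubles the quantization step between representable weight values.

```python
def symmetric_scale(max_abs, reduce_range):
    """Quantization scale for symmetric int8 weights.

    With reduce_range=True one bit is dropped (quant_max 127 -> 63),
    so the scale roughly doubles and the weights become coarser.
    """
    quant_max = 63 if reduce_range else 127
    return max_abs / quant_max

w_max = 0.5  # hypothetical largest |weight| in a tensor
print(symmetric_scale(w_max, reduce_range=True))   # ~0.0079, coarser steps
print(symmetric_scale(w_max, reduce_range=False))  # ~0.0039, finer steps
```

So setting reduce_range=False for weights should recover a full bit of weight precision at no cost.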
cc: @haowhsu-quic @winskuo-quic @shewu-quic @DannyYuyang-quic
@haowhsu-quic |