Arm backend: Make per-channel quantization default #11873
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/11873
Note: Links to docs will display an error until the docs builds have been completed.
❌ 2 New Failures as of commit cfaa79f with merge base 2bd96df.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot label ciflow/trunk
@pytorchbot label "partner: arm"
@pytorchbot label "topic: not user facing"
@martinlsm no need to add
I assume rationale is accuracy but can you please add more details? And linear and conv weights only right? perf impact? I also didn't see any ATOL/RTOL change in this diff.
  "# Create and configure quantizer to use a symmetric quantization config globally on all nodes\n",
  "quantizer = EthosUQuantizer(compile_spec)\n",
- "operator_config = get_symmetric_quantization_config(is_per_channel=False)\n",
+ "operator_config = get_symmetric_quantization_config(is_per_channel=True)\n",
Nit
- "operator_config = get_symmetric_quantization_config(is_per_channel=True)\n",
+ "operator_config = get_symmetric_quantization_config()\n",
@digantdesai I have resolved your code comment and answered all your questions in the updated commit message.
Thank you!
Force-pushed from 8175bb3 to c3567b4
I am glad to see we didn't have to mess with the backend code at all when quantizing the weights differently. Thanks @martinlsm.
@digantdesai we did mess with the backend code, but in a separate PR: #11752.
Force-pushed from fb40b0a to 4dd805c
Support for per-channel quantization was recently added to the Arm backend. This patch changes the default setting to use per-channel quantization for weights in convolutional and linear layers, instead of per-tensor quantization, which was the previous default.

The reason for this change is that per-channel quantization offers better numerical accuracy for models containing convolutional and/or fully connected layers. Unless there is an explicit limitation in the use case that prevents the use of per-channel quantization, it is generally preferred.

The quantization granularity can still be set manually using `get_symmetric_quantization_config(is_per_channel=False)`. This patch only changes the default.

Unit and model tests are affected by this change. Error tolerances for those tests have not been changed, as model outputs are compared against a reference that uses the exact same quantization strategy. That is, if a model output is altered by this patch, the reference it is compared against would also be altered accordingly.

To verify the impact of this change in terms of top-1 and top-5 accuracy, a manual test was run on MobileNetV2. The results show a noticeable improvement:

- Per-tensor quantization Top-1 / Top-5 accuracy: 66.45% / 87.50%
- Per-channel quantization Top-1 / Top-5 accuracy: 70.85% / 89.50%

Change-Id: I35d5c62741c7f93b916560874689245db96a588b
Signed-off-by: Martin Lindström <[email protected]>
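The accuracy benefit described above can be illustrated without any ExecuTorch code. The following is a minimal NumPy sketch (not the backend's implementation) showing why per-channel scales reconstruct weights more faithfully than a single per-tensor scale when output channels have very different magnitudes, as is common in conv and linear weights. All names here are illustrative.

```python
import numpy as np

def quantize_dequantize(w, scale):
    """Symmetric int8 quantize-then-dequantize with the given scale(s)."""
    q = np.clip(np.round(w / scale), -127, 127)
    return q * scale

rng = np.random.default_rng(0)
# Toy conv-style weight: 4 output channels with very different magnitudes.
w = rng.standard_normal((4, 8)) * np.array([[0.01], [0.1], [1.0], [10.0]])

# Per-tensor: one scale for the whole tensor; small channels lose precision.
scale_tensor = np.abs(w).max() / 127.0
err_tensor = np.abs(w - quantize_dequantize(w, scale_tensor)).mean()

# Per-channel: one scale per output channel (axis 0).
scale_channel = np.abs(w).max(axis=1, keepdims=True) / 127.0
err_channel = np.abs(w - quantize_dequantize(w, scale_channel)).mean()

print(err_channel < err_tensor)  # per-channel reconstructs more accurately
```

With a per-tensor scale, the channel whose weights are around 0.01 is quantized with a step size dominated by the channel around 10.0, so it rounds almost entirely to zero; per-channel scales avoid this, which is the intuition behind the MobileNetV2 top-1/top-5 improvement reported above.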
Previously we were just a few minutes off the 90-minute timeout. With per-channel quantization enabled by default it seems that we exceed that limit consistently. This patch increases the timeout to 120 minutes.

Change-Id: I20f3fb369329dd51e95ffec667617afe93c50aa3
Signed-off-by: Oscar Andersson <[email protected]>
Force-pushed from 4dd805c to 07c2ba7
LOL That's what I thought :D Are we ready to merge this one then?
Unrelated CI failures
cc @digantdesai @freddan80 @per @zingo @oscarandersson8218