Conversation

@GregoryComer GregoryComer commented Feb 9, 2025

Summary

Support U8 ops in the XNNPACK delegate by treating the input and output tensors as u8 asymmetric-quantized tensors with scale=1 and zero_point=0. This PR adds U8 support for upsample_bilinear2d, cat, slice, and _to_copy (when used to convert u8 to f32). More ops are possible with this method.
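The approach relies on the standard affine dequantization formula, real = (q - zero_point) * scale: with scale=1 and zero_point=0, the dequantized value is numerically identical to the raw u8 value, so unmodified u8 data can flow through quantized kernels. A minimal sketch of that identity (the function name is illustrative, not ExecuTorch API):

```python
def dequantize(q: int, scale: float = 1.0, zero_point: int = 0) -> float:
    """Affine dequantization: real = (q - zero_point) * scale."""
    return (q - zero_point) * scale

# With scale=1 and zero_point=0, dequantization is the identity map on
# the u8 range, so a plain u8 tensor can be treated as a quantized one.
assert all(dequantize(q) == float(q) for q in range(256))
```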

Conversion from u8 to f32 is done by transforming the conversion into a dequantize op, implemented in a new pass, ReplaceU8ConvertWithDqPass. The general u8-to-quantized-u8 transformation is done in define_tensor in node_visitor.py, where u8 inputs are created as quantized tensors with the appropriate qparams (scale=1, zero_point=0).
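As an illustration only, the rewrite can be sketched as a toy graph pass. The Node class, op names, and function below are hypothetical stand-ins, not the actual ReplaceU8ConvertWithDqPass implementation, which operates on an exported PyTorch graph:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    op: str                        # e.g. "_to_copy", "dequantize", "cat"
    kwargs: dict = field(default_factory=dict)

def replace_u8_convert_with_dq(graph: list[Node]) -> list[Node]:
    """Toy version of the rewrite: a u8 -> f32 _to_copy becomes a
    dequantize node with scale=1, zero_point=0, which is numerically
    equivalent and is an op the delegate already supports."""
    out = []
    for node in graph:
        if (node.op == "_to_copy"
                and node.kwargs.get("src_dtype") == "u8"
                and node.kwargs.get("dtype") == "f32"):
            out.append(Node("dequantize", {"scale": 1.0, "zero_point": 0}))
        else:
            out.append(node)
    return out

g = [Node("_to_copy", {"src_dtype": "u8", "dtype": "f32"}), Node("cat")]
g2 = replace_u8_convert_with_dq(g)
assert g2[0].op == "dequantize" and g2[1].op == "cat"
```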

Test plan

I've added op-level u8 tests to each of the new ops, as well as tests for the ReplaceU8ConvertWithDqPass. I've also added an end-to-end test for MobileNetV3 with a wrapper to take U8 inputs, resize and crop, and then convert to f32 and run the model.
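The crop-and-convert portion of such a wrapper can be sketched in plain Python. This is only the same logic in list form; the actual test wraps MobileNetV3 and does this with tensor ops inside the exported module, and the function name and sizes here are hypothetical:

```python
def u8_center_crop_to_f32(img, size):
    """Center-crop an HxW u8 image (nested lists) to size x size,
    then convert to f32 in [0, 1] by dividing by 255."""
    h, w = len(img), len(img[0])
    top, left = (h - size) // 2, (w - size) // 2
    crop = [row[left:left + size] for row in img[top:top + size]]
    return [[px / 255.0 for px in row] for row in crop]

img = [[r * 4 + c for c in range(4)] for r in range(4)]  # 4x4 u8 values
out = u8_center_crop_to_f32(img, 2)
assert out == [[5 / 255.0, 6 / 255.0], [9 / 255.0, 10 / 255.0]]
```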


pytorch-bot bot commented Feb 9, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/8330

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 1fbaf0e with merge base d99970b:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Feb 9, 2025
@GregoryComer GregoryComer added the release notes: xnnpack Changes to the XNNPack backend delegate label Feb 9, 2025
@GregoryComer GregoryComer force-pushed the xnn-u8 branch 2 times, most recently from 4897af3 to c5134f1 on February 10, 2025 02:58
@GregoryComer GregoryComer marked this pull request as ready for review February 10, 2025 04:21
@GregoryComer GregoryComer changed the title (WIP) Add XNN U8 op support via quantization Add XNN U8 op support via quantization Feb 10, 2025
@facebook-github-bot

@GregoryComer has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.


@mcr229 mcr229 left a comment


This generally looks right to me.

@digantdesai

Ah, I reviewed this internally first; please look at the comments there. Next time I'll remember.

@github-actions

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

@github-actions github-actions bot added the stale PRs inactive for over 60 days label Aug 30, 2025

