Enable 16-bit activations in Cadence Quantizer For fully_connected and linear #15010
Summary:

Context

We currently support only 8-bit activations for most operators. We would like to add generic op support for 16-bit activations, for the following ops: `quantized_linear` and `quantized_fully_connected`.
This Diff

Here, we add support for `quantized_linear` and `quantized_fully_connected`. We need to do the following:

- Update `quantized_fully_connected_out.cpp` and `quantized_linear_out.cpp` to handle 16-bit activations (the shared numerics are sketched below).
- Update `ref_implementations.py`, so tests can run with 16-bit activations to validate that the quantization is correct.
- Add a new quantizer (`CadenceWith16BitLinearActivationsQuantizer`) to check that this works, and create a unit test (a spec sketch and a test sketch follow the numerics sketch below).
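As a concrete illustration of the numerics that the out-variant kernels and `ref_implementations.py` have to agree on, here is a minimal Python sketch of a quantized linear with 16-bit activations, 8-bit weights, and int32 bias. The parameter names (`in_zero_point`, `out_multiplier`, `out_shift`, ...) follow the common fixed-point requantization convention and are assumptions, not the exact interface in this diff.

```python
import torch

def quantized_linear_ref(
    src: torch.Tensor,       # int16 activations, shape (..., in_features)
    weight: torch.Tensor,    # int8 weights, shape (out_features, in_features)
    bias: torch.Tensor,      # int32 bias, shape (out_features,)
    in_zero_point: int,
    weight_zero_point: int,
    out_multiplier: int,     # Q31 fixed-point requantization multiplier
    out_shift: int,
    out_zero_point: int,
) -> torch.Tensor:
    # Widen to int64 so the matmul accumulates without overflow.
    acc = (src.to(torch.int64) - in_zero_point) @ (
        weight.to(torch.int64) - weight_zero_point
    ).T + bias.to(torch.int64)
    # Requantize: scale by out_multiplier / 2**31 and 2**out_shift,
    # round, add the output zero point, then saturate to int16.
    # (Shift-sign conventions vary; positive out_shift is a left shift here.)
    scaled = torch.round(acc.double() * (out_multiplier / (1 << 31)) * 2.0**out_shift)
    info = torch.iinfo(torch.int16)
    return (scaled + out_zero_point).clamp(info.min, info.max).to(torch.int16)
```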
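On the quantizer side, the key change relative to the existing 8-bit configs is the activation `QuantizationSpec`. Below is a hedged sketch using the PT2E `QuantizationSpec` API; how `CadenceWith16BitLinearActivationsQuantizer` actually wires a spec like this into the Cadence quantizer classes may differ, and the observer choice here is illustrative only.

```python
import torch
from torch.ao.quantization.observer import MinMaxObserver
from torch.ao.quantization.quantizer import QuantizationSpec

# int16 activation spec; weights can stay int8 as in the existing configs.
act_qspec = QuantizationSpec(
    dtype=torch.int16,
    quant_min=torch.iinfo(torch.int16).min,  # -32768
    quant_max=torch.iinfo(torch.int16).max,  # 32767
    qscheme=torch.per_tensor_affine,
    observer_or_fake_quant_ctr=MinMaxObserver,
)
```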
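Finally, a minimal sketch of the kind of unit test mentioned above, reusing the hypothetical `quantized_linear_ref` from the first sketch: quantize a float linear to 16-bit activations and 8-bit weights, run the reference, and check the dequantized output against the integer-exact expectation. All helper names are local to this sketch.

```python
import torch

def test_quantized_linear_16bit() -> None:
    torch.manual_seed(0)
    x, w, b = torch.randn(4, 8), torch.randn(6, 8), torch.randn(6)

    # Symmetric per-tensor scales: int16 activations, int8 weights.
    x_scale = x.abs().max().item() / 32767
    w_scale = w.abs().max().item() / 127
    x_q = torch.round(x / x_scale).to(torch.int16)
    w_q = torch.round(w / w_scale).to(torch.int8)
    b_q = torch.round(b / (x_scale * w_scale)).to(torch.int32)

    # Output scale with a little headroom so quantization noise in the
    # operands cannot push the result into int16 saturation.
    out_fp = x @ w.T + b
    out_scale = 1.05 * out_fp.abs().max().item() / 32767

    # Fold the three scales into a single Q31 requantization multiplier.
    ratio = x_scale * w_scale / out_scale
    out_q = quantized_linear_ref(
        x_q, w_q, b_q,
        in_zero_point=0, weight_zero_point=0,
        out_multiplier=round(ratio * (1 << 31)), out_shift=0,
        out_zero_point=0,
    )

    # The dequantized result should match the value implied by the
    # quantized operands to within one output quantization step.
    expected = (
        x_q.to(torch.float64) @ w_q.to(torch.float64).T + b_q.to(torch.float64)
    ) * (x_scale * w_scale)
    assert (out_q.to(torch.float64) * out_scale - expected).abs().max() <= out_scale
```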
Differential Revision: D84284794