
Conversation

@pytorchbot
Collaborator

This PR was created by the merge bot to help merge the original PR into the main branch.
ghstack PR number: #12005 by @ahmtox
^ Please use this as the source of truth for the PR details, comments, and reviews
ghstack PR base: https://github.com/pytorch/executorch/tree/gh/ahmtox/24/base
ghstack PR head: https://github.com/pytorch/executorch/tree/gh/ahmtox/24/head
Merge bot PR base: https://github.com/pytorch/executorch/tree/main
Merge bot PR head: https://github.com/pytorch/executorch/tree/gh/ahmtox/24/orig
@diff-train-skip-merge

Pull Request resolved: #12005

# Context

This test framework establishes the foundation for validating the `linear_qta8a_qga4w` operator implementation as part of enabling dynamic quantization. The motivation is to advance beyond weight-only quantization to linear operations with both activations and weights quantized, enabling true integer arithmetic throughout the matrix multiplication for improved performance on GPU hardware.

The current weight-only quantized linear implementations in ET-VK dequantize weights to floating point before computation, missing the performance benefits of integer arithmetic.

The operator name breaks down as follows:
- **qta8a**: Quantized per-token affine 8-bit activation inputs
- **qga4w**: Quantized per-group affine 4-bit weights
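
In tensor terms, these layouts imply the following dequantization (a minimal libtorch sketch for illustration only, not the ET-VK code; the `[T, K]`/`[N, K]` shapes and the `group_size` parameter are assumptions about the layout, not taken from this PR):

```cpp
#include <torch/torch.h>

// qta8a: every token (row of the [T, K] int8 activation) carries its own
// affine (scale, zero_point) pair.
at::Tensor dequantize_qta8a(
    const at::Tensor& q_act,        // int8,  [T, K]
    const at::Tensor& scales,       // float, [T, 1]
    const at::Tensor& zero_points)  // int,   [T, 1]
{
  return (q_act.to(at::kFloat) - zero_points.to(at::kFloat)) * scales;
}

// qga4w: 4-bit weight values are grouped along the input dimension K;
// each group of `group_size` consecutive values shares one affine pair.
at::Tensor dequantize_qga4w(
    const at::Tensor& q_w,          // unpacked 4-bit values as int, [N, K]
    const at::Tensor& scales,       // float, [N, K / group_size]
    const at::Tensor& zero_points,  // int,   [N, K / group_size]
    int64_t group_size)
{
  const int64_t N = q_w.size(0), K = q_w.size(1);
  auto grouped = q_w.reshape({N, K / group_size, group_size}).to(at::kFloat);
  auto deq = (grouped - zero_points.unsqueeze(-1).to(at::kFloat)) *
             scales.unsqueeze(-1);
  return deq.reshape({N, K});
}
```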

# Changes

The reference implementation (`linear_qta8a_qga4w_4bit_dequant_impl`) provides a baseline for validating the GPU shader implementation through a deliberately simplified computation path. The quantized int8 input tensor is dequantized using the standard affine transformation `(quantized_input.to(at::kFloat) - input_zero_point) * input_scale`. After dequantization, the implementation performs a standard floating-point linear operation, `at::linear(x_float, weights_dequantized)`. This two-stage dequantize → compute approach provides a clear reference against which the GPU's integer arithmetic implementation can be validated.
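
A minimal sketch of that two-stage reference path, assuming libtorch; the function name and argument list here are illustrative and are not the exact signature of `linear_qta8a_qga4w_4bit_dequant_impl`:

```cpp
#include <torch/torch.h>

// Stage 1: affine-dequantize the int8 activations to float.
// Stage 2: run a plain floating-point linear against weights that have
// already been dequantized from their 4-bit groups (see sketch above).
at::Tensor linear_qta8a_qga4w_reference(
    const at::Tensor& quantized_input,      // int8 activations
    const at::Tensor& input_scale,          // per-token scales
    const at::Tensor& input_zero_point,     // per-token zero points
    const at::Tensor& weights_dequantized)  // weights already in float
{
  at::Tensor x_float =
      (quantized_input.to(at::kFloat) - input_zero_point.to(at::kFloat)) *
      input_scale;
  return at::linear(x_float, weights_dequantized);
}
```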
ghstack-source-id: 295393632
@exported-using-ghexport

Differential Revision: [D77173442](https://our.internmc.facebook.com/intern/diff/D77173442/)
@pytorchbot requested a review from SS-JIA as a code owner July 10, 2025 22:46

pytorch-bot bot commented Jul 10, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/12375

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 Cancelled Job, 4 Unrelated Failures

As of commit a337a5c with merge base b5f950b:

CANCELLED JOB - The following job was cancelled. Please retry:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label Jul 10, 2025
@github-actions

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e., would users of this library care about this change?), please use a label starting with `release notes:`. This helps us keep track of and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@ahmtox self-requested a review July 11, 2025 00:46
@ahmtox merged commit fc435fa into main Jul 11, 2025
92 of 97 checks passed
@ahmtox deleted the gh/ahmtox/24/orig branch July 11, 2025 02:50