Update Int4PreshuffledTensor to align with implementation details of the Float8Tensor #2738
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2738
Note: Links to docs will display an error until the doc builds have been completed.
⏳ No Failures, 12 Pending — as of commit 1073ced with merge base 7c13cde. This comment was automatically generated by Dr. CI and updates every 15 minutes.
```python
    group_zero = weight_tensor.group_zero.contiguous()
    res = torch.ops.fbgemm.bf16i4bf16_shuffled(
        input_tensor, wq, group_scale, group_zero
    )
else:
    # dynamically quantizes activation to fp8
    assert weight_tensor.row_scale is not None
    row_scale = weight_tensor.row_scale.contiguous()
    xq, x_scale = quantize_fp8_row(input_tensor)
```
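For context, the fp8 branch above relies on per-row dynamic quantization of the activation. A minimal pure-Python sketch of the idea behind a row-wise fp8-style quantizer follows; the real `quantize_fp8_row` lives in fbgemm_gpu and operates on CUDA tensors, so the function name `quantize_fp8_row_sketch`, the `FP8_E4M3_MAX` constant usage, and the list-based representation here are illustrative assumptions only:

```python
# Illustrative sketch of row-wise fp8-style quantization
# (not the actual fbgemm quantize_fp8_row implementation).
FP8_E4M3_MAX = 448.0  # max representable magnitude in float8 e4m3

def quantize_fp8_row_sketch(rows):
    """For each row, compute scale = max(|x|) / FP8_E4M3_MAX and
    quantize values as round(x / scale).

    Returns (quantized_rows, per_row_scales)."""
    quantized, scales = [], []
    for row in rows:
        max_abs = max(abs(v) for v in row) or 1.0  # avoid div-by-zero
        scale = max_abs / FP8_E4M3_MAX
        quantized.append([round(v / scale) for v in row])
        scales.append(scale)
    return quantized, scales

xq, x_scale = quantize_fp8_row_sketch([[1.0, -2.0, 4.0], [0.5, 0.25, -0.5]])
```

Each row gets its own scale, so one outlier row does not destroy the precision of the others — which is why the kernel also takes `row_scale` for the weight side.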
btw @vkuzo, we didn't use `_choose_quant_func_and_quantize_tensor` and float8 quant args yet, but we can add them if there are multiple choices of float8 activation quant in the future. Please let me know if you have different thoughts here.
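The comment above is about leaving room for multiple float8 activation-quantization choices instead of hard-coding a single fp8 path. A hypothetical sketch of what a registry-style dispatch could look like; the names `QuantGranularity`, `choose_quant_func`, and both quantizer functions below are illustrative assumptions, not the torchao API:

```python
from enum import Enum, auto

class QuantGranularity(Enum):
    PER_ROW = auto()
    PER_TENSOR = auto()

def scales_per_row(rows):
    # one scale per row: max(|x|) over that row
    return [max(abs(v) for v in row) for row in rows]

def scales_per_tensor(rows):
    # one scale for the whole tensor
    return [max(abs(v) for row in rows for v in row)]

# registry-style dispatch: the quant function is picked from config,
# so adding a new granularity is a table entry, not an if/else chain
_QUANT_FUNCS = {
    QuantGranularity.PER_ROW: scales_per_row,
    QuantGranularity.PER_TENSOR: scales_per_tensor,
}

def choose_quant_func(granularity):
    return _QUANT_FUNCS[granularity]

scales = choose_quant_func(QuantGranularity.PER_ROW)([[1.0, -3.0], [2.0, 0.5]])
```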
Stacked PRs:
- `optional_tensor_names` in TorchAOBaseTensor #2710
- Update Int4PreshuffledTensor to align with implementation details of the Float8Tensor (this PR)
Summary:
Similar to #2687, we updated Int4PreshuffledTensor to align its implementation details, and also used TorchAOBaseTensor to simplify some of the implementation.
Note: this is just a refactor of Int4PreshuffledTensor; there are no BC-related changes in this PR.
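As a rough illustration of the TorchAOBaseTensor pattern being leaned on here: a subclass declares its tensor fields once via class attributes, and the base class can derive shared machinery (flatten/unflatten, repr, etc.) from those declarations. The minimal base class below is a mock for illustration, not the real torchao implementation, though the attribute names `tensor_data_names` and `optional_tensor_names` follow the torchao convention from #2710:

```python
# Mock of the declarative-attribute pattern used by TorchAOBaseTensor.
class BaseTensorMock:
    tensor_data_names = []      # required tensor fields
    optional_tensor_names = []  # fields that may be None

    def fields(self):
        # collect required fields, plus optional fields that are present
        names = list(self.tensor_data_names)
        names += [n for n in self.optional_tensor_names
                  if getattr(self, n) is not None]
        return {n: getattr(self, n) for n in names}

class Int4PreshuffledMock(BaseTensorMock):
    tensor_data_names = ["qdata", "group_scale"]
    optional_tensor_names = ["group_zero", "row_scale"]

    def __init__(self, qdata, group_scale, group_zero=None, row_scale=None):
        self.qdata = qdata
        self.group_scale = group_scale
        self.group_zero = group_zero  # present on the bf16 kernel path
        self.row_scale = row_scale    # present on the fp8 kernel path

# fields() includes row_scale but skips the absent group_zero
t = Int4PreshuffledMock(qdata=[1, 2], group_scale=[0.1], row_scale=[0.5])
```

Declaring the field lists once is what lets the base class absorb boilerplate that each tensor subclass previously re-implemented by hand.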
Test Plan:
python test/quantization/quantize_/workflows/int4/test_int4_preshuffled_tensor.py
Reviewers:
Subscribers:
Tasks:
Tags: