
Update Int4PreshuffledTensor to align with implementation details of the Float8Tensor #2738

Merged: 1 commit merged on Aug 12, 2025

Conversation

jerryzh168
Contributor

@jerryzh168 jerryzh168 commented Aug 11, 2025

Stacked PRs:


Update Int4PreshuffledTensor to align with implementation details of the Float8Tensor

Summary:
Similar to #2687, this PR updates Int4PreshuffledTensor to align its implementation details with Float8Tensor, and uses TorchAOBaseTensor to simplify parts of the implementation.

Note: This PR only refactors Int4PreshuffledTensor; it contains no BC-related changes.

Test Plan:
python test/quantization/quantize_/workflows/int4/test_int4_preshuffled_tensor.py

Reviewers:

Subscribers:

Tasks:

Tags:

stack-info: PR: #2738, branch: jerryzh168/stack/20

pytorch-bot bot commented Aug 11, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2738

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 12 Pending

As of commit 1073ced with merge base 7c13cde:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

jerryzh168 added a commit that referenced this pull request Aug 11, 2025
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/20 branch from b45aca0 to 584723f Compare August 11, 2025 20:37
@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Aug 11, 2025
@jerryzh168 jerryzh168 added the topic: not user facing Use this tag if you don't want this PR to show up in release notes label Aug 11, 2025
@jerryzh168 jerryzh168 requested review from vkuzo and drisspg August 11, 2025 20:41
@jerryzh168 jerryzh168 changed the base branch from jerryzh168/stack/17 to main August 12, 2025 02:49
jerryzh168 added a commit that referenced this pull request Aug 12, 2025
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/20 branch from 584723f to 9b72101 Compare August 12, 2025 02:49
@jerryzh168 jerryzh168 changed the base branch from main to jerryzh168/stack/17 August 12, 2025 02:50
@jerryzh168 jerryzh168 changed the base branch from jerryzh168/stack/17 to main August 12, 2025 02:52
jerryzh168 added a commit that referenced this pull request Aug 12, 2025
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/20 branch from 9b72101 to 986ce18 Compare August 12, 2025 02:52
@jerryzh168 jerryzh168 changed the base branch from main to jerryzh168/stack/17 August 12, 2025 02:52
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/17 branch from 4edb547 to 2005be0 Compare August 12, 2025 16:57
jerryzh168 added a commit that referenced this pull request Aug 12, 2025
@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/20 branch from 986ce18 to c0b8f24 Compare August 12, 2025 16:57
    group_zero = weight_tensor.group_zero.contiguous()
    res = torch.ops.fbgemm.bf16i4bf16_shuffled(
        input_tensor, wq, group_scale, group_zero
    )
else:
    # dynamically quantizes activation to fp8
    assert weight_tensor.row_scale is not None
    row_scale = weight_tensor.row_scale.contiguous()
    xq, x_scale = quantize_fp8_row(input_tensor)

btw @vkuzo, we don't use _choose_quant_func_and_quantize_tensor or float8 quant args here yet, but we can add them if there are multiple choices of float8 activation quant in the future. Please let me know if you have different thoughts here.
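
For context on the fp8 branch discussed above, here is a rough sketch of what row-wise dynamic activation quantization computes conceptually. It is not the `quantize_fp8_row` implementation: `rowwise_scale_and_clamp` is a hypothetical name, plain Python lists stand in for tensors, the use of the float8_e4m3fn range (max 448) is an assumption, and the final rounding to fp8 storage is omitted.

```python
from typing import List, Tuple

# Largest finite value representable in float8_e4m3fn (assumed target format).
FP8_E4M3_MAX = 448.0


def rowwise_scale_and_clamp(
    x: List[List[float]], eps: float = 1e-12
) -> Tuple[List[List[float]], List[float]]:
    """Hypothetical helper: compute one scale per row from that row's
    absolute maximum, then rescale the row into the fp8 range.

    Returns (scaled_rows, scales) such that scaled * scale ~= original.
    The cast to an actual fp8 dtype is intentionally left out.
    """
    scaled_rows: List[List[float]] = []
    scales: List[float] = []
    for row in x:
        row_max = max((abs(v) for v in row), default=0.0)
        # eps guards against division by zero for all-zero rows.
        scale = max(row_max, eps) / FP8_E4M3_MAX
        scaled = [
            min(max(v / scale, -FP8_E4M3_MAX), FP8_E4M3_MAX) for v in row
        ]
        scaled_rows.append(scaled)
        scales.append(scale)
    return scaled_rows, scales
```

Because the scale is derived from the input at call time, nothing about the activation needs to be stored on the weight tensor; only the weight-side scales are kept there. If several float8 activation schemes were supported later, this is the step a configurable quantization function would replace.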

@jerryzh168 jerryzh168 force-pushed the jerryzh168/stack/20 branch from c0b8f24 to 1073ced Compare August 12, 2025 16:59
@jerryzh168 jerryzh168 changed the base branch from jerryzh168/stack/17 to main August 12, 2025 16:59
@jerryzh168 jerryzh168 merged commit 4fe5ec6 into main Aug 12, 2025
13 of 18 checks passed
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. topic: not user facing Use this tag if you don't want this PR to show up in release notes

2 participants