Add IntxUnpackedTensor #2732
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2732
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit f6c9d09 with merge base e6b38bb.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
        block_size: the block size for quantization, representing the granularity, for example groupwise quantization will have block_size (1, group_size)
    """

    tensor_data_attrs = ["int_data", "scale", "zero_point"]
btw if you update these to tensor_data_names and tensor_attribute_names, you'll be able to remove some of the implementations; see docs in https://github.com/pytorch/ao/pull/2710/files#diff-d2a11602a79e83305208472f1abe6a4106f02ce62a7f9524007181813863fcf6R687 and the example in #2738
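For reference, a minimal sketch of the pattern being suggested (attribute names mirror this PR; which methods get auto-generated is per the linked docs, so treat the comment as an assumption):

    from torchao.utils import TorchAOBaseTensor

    class IntxUnpackedTensor(TorchAOBaseTensor):
        # With these two class attributes defined, TorchAOBaseTensor can
        # derive __tensor_flatten__/__tensor_unflatten__, __repr__, and
        # default device-movement ops, so those hand-written
        # implementations can be deleted here.
        tensor_data_names = ["int_data", "scale", "zero_point"]
        tensor_attribute_names = ["bit_width", "block_size"]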
I can still override the behavior in TorchAOBaseTensor, right?
For example, it looks like aten._to_copy.default gets auto-populated, but I want to define its dtype variant in addition to the device variant.
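Something like the following is what's being asked about — a sketch, not the PR's actual code. The handler body, constructor argument order, and import path are assumptions; the `implements` decorator and `return_and_correct_aliasing` follow the usual TorchAOBaseTensor handler pattern:

    import torch
    from torch.utils._python_dispatch import return_and_correct_aliasing
    from torchao.quantization import IntxUnpackedTensor  # import path is an assumption

    aten = torch.ops.aten

    # Re-registering _to_copy overrides the auto-populated device-only
    # variant so a dtype change (e.g. float32 -> bfloat16) is handled too.
    @IntxUnpackedTensor.implements(aten._to_copy.default)
    def _(func, types, args, kwargs):
        self = args[0]
        device = kwargs.pop("device", self.device)
        dtype = kwargs.pop("dtype", self.dtype)
        new = IntxUnpackedTensor(
            self.int_data.to(device),                   # quantized values stay integer
            self.scale.to(device=device, dtype=dtype),  # only high-precision attrs change dtype
            self.zero_point.to(device),
            bit_width=self.bit_width,
            block_size=self.block_size,
        )
        return return_and_correct_aliasing(func, args, kwargs, new)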
this should work; I haven't actively tested this behavior though. I'll try to add a test for it
Changed
        )

    @classmethod
    def from_float(
nit: we are standardizing on from_hp now
What does hp stand for?
high precision
torchao/quantization/quant_api.py (Outdated)
@@ -2060,6 +2061,8 @@ class IntxWeightOnlyConfig(AOBaseConfig):
     mapping_type: MappingType = MappingType.SYMMETRIC
     scale_dtype: Optional[torch.dtype] = None
     layout: Layout = QDQLayout()
+    packing_format: PackingFormat = PackingFormat.UNPACKED
+    VERSION: int = 1
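For context, a hedged usage sketch of the updated config: packing_format and VERSION come from the diff above, while weight_dtype, granularity, the PackingFormat import path, and VERSION=2 selecting the new tensor are all assumptions:

    import torch
    from torchao.quantization import quantize_, IntxWeightOnlyConfig
    from torchao.quantization import PackingFormat  # import path is an assumption
    from torchao.quantization.granularity import PerGroup

    model = torch.nn.Sequential(torch.nn.Linear(128, 256))
    quantize_(
        model,
        IntxWeightOnlyConfig(
            weight_dtype=torch.int4,                # assumed existing field
            granularity=PerGroup(32),               # assumed existing field
            packing_format=PackingFormat.UNPACKED,  # new field from this diff
            VERSION=2,                              # assumed: non-default version opts into the new tensor
        ),
    )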
nit: we updated the name to version
Any more concerns here @jerryzh168?
    This format is intended for torch.export use cases.

    Tensor Attributes:
        int_data: int data for quantization.
nit: use qdata to align with other tensors
            block_size=block_size,
        )

    def get_plain(self):
nit: no longer need this I think
    @classmethod
    def from_hp(
        cls,
        float_tensor: torch.Tensor,
nit: use hp_tensor to align with the method name
        cls,
        float_tensor: torch.Tensor,
        block_size: Tuple[int],
        dtype: torch.dtype,
nit: rename to target_dtype for more clarity
class IntxUnpackedTensor(TorchAOBaseTensor):
    """
    intx quantization with unpacked format. Subbyte quantized data is represented as int8.
nit: to make it clearer, I think we can add a bit more description here about subbyte quantized data. We should mention that the range of the quantized values is restricted to the quant_min and quant_max of the target bit width, e.g. for uint4 the values fall into the range 0 to 15.
Done
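Concretely, the restriction now described in the docstring (a minimal illustration using the standard two's-complement range for a signed 4-bit target):

    import torch

    bit_width = 4
    # Signed int4 range: quant_min = -8, quant_max = 7 (uint4 would be 0..15).
    quant_min = -(2 ** (bit_width - 1))
    quant_max = 2 ** (bit_width - 1) - 1

    # Subbyte values are stored widened to int8, but stay inside the int4 range.
    qdata = torch.randint(quant_min, quant_max + 1, (4, 8), dtype=torch.int8)
    assert qdata.min() >= quant_min and qdata.max() <= quant_max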
        block_size: Optional[Tuple[int]] = None,
    ):
        # Check plain data and infer block_size from shapes
        if block_size is None:
would it be easier to just make block_size required? When is block_size None?
Removed. I did use it in the slice implementation, but I just added logic inside slice to recompute the block size.
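The removed inference presumably looked something like this (a hypothetical reconstruction: per-dimension block sizes are recovered by dividing the data shape by the scale shape):

    import torch

    def infer_block_size(int_data: torch.Tensor, scale: torch.Tensor) -> tuple:
        # Hypothetical helper: each dim's block size is the ratio of data
        # extent to scale extent, e.g. (8, 64) data with (8, 4) scales
        # gives block_size (1, 16), i.e. group size 16.
        assert int_data.dim() == scale.dim()
        block_size = []
        for d, s in zip(int_data.shape, scale.shape):
            assert d % s == 0
            block_size.append(d // s)
        return tuple(block_size)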
        self.bit_width = bit_width
        self.block_size = block_size

    def __repr__(self):
btw, repr is also implemented by default in TorchAOBaseTensor when you define tensor_data_names and tensor_attribute_names
Removed
        device = kwargs.pop("device")
        dtype = kwargs.pop("dtype")
        assert dtype in _FLOAT_TYPES
        return self.__class__(
nit: self.__class__ --> IntxUnpackedTensor, to reduce runtime checks and align with other code
        scale = aten.slice.Tensor(self.scale, dim, start_scale, end_scale, step)
        zero_point = aten.slice.Tensor(self.zero_point, dim, start_scale, end_scale, step)

        new = self.__class__(
same here
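The index mapping this slice code relies on can be sketched like this (a hypothetical helper; the PR's actual slice handling may differ): indices on the quantized data are translated to indices on scale/zero_point by dividing by the per-dimension block size.

    def scale_slice_bounds(start: int, end: int, block_size_dim: int):
        # Divide qdata slice indices by the block size for this dimension,
        # rounding the end up so partially covered blocks keep their scale.
        start_scale = start // block_size_dim
        end_scale = -(-end // block_size_dim)  # ceil division
        return start_scale, end_scale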
Force-pushed from 93948a4 to 143fe91
LG, please add a bit more detail in the PR summary to explain the context for the change, and a Test Plan as well.
* add intx unpacked tensor
* up
* up
* up
* up
* up
This adds IntxUnpackedTensor, where subbyte quantized data is represented as int8. The range of the quantized values is restricted to the quant_min and quant_max of the target_dtype; e.g., if target_dtype=torch.int4, qdata will be an int8 tensor with values in [-8, 7]. Quantization is represented in a decomposed way.
This tensor is intended for export use cases that currently use AQT with QDQLayout.
The test plan is the new unit tests.
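A small illustration of the decomposed representation described above (shapes and group size are arbitrary):

    import torch

    n, k, group_size = 4, 32, 16
    # int4 stored as int8: qdata values restricted to [-8, 7], with one
    # (scale, zero_point) pair per group of 16 elements along the last dim.
    qdata = torch.randint(-8, 8, (n, k), dtype=torch.int8)
    scale = torch.rand(n, k // group_size)
    zero_point = torch.zeros(n, k // group_size, dtype=torch.int8)

    # Decomposed dequantize: hp = (q - zero_point) * scale, applied per group.
    q = qdata.reshape(n, k // group_size, group_size).to(torch.float32)
    hp = (q - zero_point.unsqueeze(-1)) * scale.unsqueeze(-1)
    hp = hp.reshape(n, k)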