Skip to content

Conversation

@metascroy
Copy link
Contributor

dequantize_affine does not work correctly before iOS18, when the op gets translated to constexpr_affine_dequantize instead of constexpr_blockwise_shift_scale.

This is because the axis should be None (default value) instead of -1. In this case, the axis gets inferred from the shape of scales.

This is similar to the change in the ExecuTorch PR: pytorch/executorch#13896

@metascroy
Copy link
Contributor Author

@YifanShenSZ can you have a look?

@YifanShenSZ
Copy link
Collaborator

Hey Scott, first to confirm your use case: Do you want weight-only quantization, or activation quantization?

Some background about coreml ops that might be helpful:

  1. The constexpr_ ops are for weight-only quantization, that's why they are "const expression"s: the quantized weight are known at compile time
  2. Among const expr ops, Blockwise shift scale is new in iOS18, affine dequantize was new in iOS16
  3. Dequantize is for activation quantization, and was new in iOS17

@metascroy
Copy link
Contributor Author

@YifanShenSZ this is for weight-only quantization, which is represented as dequantize_affine (applied to the weights) followed by linear/embedding.

The existing registration works for iOS18 (blockwise scale shift), but has a bug in affine_dequantize (iOS16). The bug is the axis is incorrect. This doesn't affect iOS18 because axis is not used in blockwise scale shift.

Dequantize is for activation quantization, and was new in iOS17

I assume you mean "quantize" here is used for activation quantization?

@YifanShenSZ
Copy link
Collaborator

@YifanShenSZ this is for weight-only quantization, which is represented as dequantize_affine (applied to the weights) followed by linear/embedding.

The existing registration works for iOS18 (blockwise scale shift), but has a bug in affine_dequantize (iOS16). The bug is the axis is incorrect. This doesn't affect iOS18 because axis is not used in blockwise scale shift.

I see. So in this case, you will want

  1. For iOS 18 and higher, translate to constexpr_blockwise_shift_scale
  2. For iOS 16 and 17, translate to constexpr_affine_dequantize
  3. For iOS 15 and earlier, error out

Dequantize is for activation quantization, and was new in iOS17

I assume you mean "quantize" here is used for activation quantization?

Yeah I mean quantize and dequantize, which are for activation quantization (i.e. quantize / dequantize variables) and introduced in iOS 17

@metascroy
Copy link
Contributor Author

metascroy commented Sep 3, 2025

@YifanShenSZ this is for weight-only quantization, which is represented as dequantize_affine (applied to the weights) followed by linear/embedding.
The existing registration works for iOS18 (blockwise scale shift), but has a bug in affine_dequantize (iOS16). The bug is the axis is incorrect. This doesn't affect iOS18 because axis is not used in blockwise scale shift.

I see. So in this case, you will want

  1. For iOS 18 and higher, translate to constexpr_blockwise_shift_scale
  2. For iOS 16 and 17, translate to constexpr_affine_dequantize
  3. For iOS 15 and earlier, error out

Dequantize is for activation quantization, and was new in iOS17

I assume you mean "quantize" here is used for activation quantization?

Yeah I mean quantize and dequantize, which are for activation quantization (i.e. quantize / dequantize variables) and introduced in iOS 17

Yes, but the function _utils._construct_constexpr_dequant_op already does this translation. I'm just fixing the arg passed to it for axis (axis is only used by _utils._construct_constexpr_dequant_op before iOS18, where it translates the op to constexpr_affine_dequantize; it is not used for iOS18, where the op gets translated to constexpr_blockwise_shift_scale). The CI only tests on iOS18, so I fix the arg for axis and add a test for iOS16.

@YifanShenSZ
Copy link
Collaborator

YifanShenSZ commented Sep 4, 2025

Got it, then yes this is correct. Let's hear what CI has to say

CI 🔴 https://gitlab.com/coremltools1/coremltools/-/commit/c47235b01c27d101c373ce06b03f1758d3457557/pipelines

@pytest.mark.skipif(not _HAS_TORCHAO, reason=MSG_TORCHAO_NOT_FOUND)
@pytest.mark.parametrize(
"compute_unit, has_zeros",
itertools.product(compute_units, [True, False], [ct.target.IOS16, ct.target.IOS17]),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: lower case "i", i.e. iOS

@YifanShenSZ
Copy link
Collaborator

YifanShenSZ commented Sep 5, 2025


@pytest.mark.skipif(not _HAS_TORCHAO, reason=MSG_TORCHAO_NOT_FOUND)
@pytest.mark.parametrize(
"compute_unit, has_zeros",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you might missed minimum_deployment_target?

@YifanShenSZ
Copy link
Collaborator

@metascroy please rebase on top of latest main + address comment, then let's try CI again

@YifanShenSZ
Copy link
Collaborator

CI 🔴 https://gitlab.com/coremltools1/coremltools/-/commit/30b74a15f3f3d6fef6b7ffbbfb1b79b2fcb0bfb6/pipelines

because we support torch up to 2.7.1 😮‍💨 please rebase on top of latest main to pick up #2591

@metascroy metascroy force-pushed the fix-torchao-dequant-ios16 branch from 9721717 to fb4eb52 Compare September 25, 2025 20:33
@metascroy
Copy link
Contributor Author

CI 🔴 https://gitlab.com/coremltools1/coremltools/-/commit/30b74a15f3f3d6fef6b7ffbbfb1b79b2fcb0bfb6/pipelines

because we support torch up to 2.7.1 😮‍💨 please rebase on top of latest main to pick up #2591

rebased

@metascroy
Copy link
Contributor Author

@YifanShenSZ can this be merged?

@YifanShenSZ
Copy link
Collaborator

YifanShenSZ commented Dec 2, 2025

@YifanShenSZ
Copy link
Collaborator

CI 🔴 due to the typo IOS -> iOS

    class TestPyTorchQuantizationOps(TorchQuantizationBaseTest):
../../envs/coremltools-py3.10/lib/python3.10/site-packages/coremltools/converters/mil/frontend/torch/test/test_torch_quantization_ops.py:278: in TestPyTorchQuantizationOps
    itertools.product(compute_units, [True, False], [ct.target.IOS16, ct.target.IOS17]),
../../envs/coremltools-py3.10/lib/python3.10/enum.py:437: in __getattr__
    raise AttributeError(name) from None
E   AttributeError: IOS16. Did you mean: 'iOS16'?

@metascroy
Copy link
Contributor Author

iOS16

Updated

@YifanShenSZ
Copy link
Collaborator

YifanShenSZ commented Dec 2, 2025

Copy link
Collaborator

@YifanShenSZ YifanShenSZ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many thanks for your contribution to coremltools! 🙏

@YifanShenSZ YifanShenSZ merged commit 145e3be into apple:main Dec 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants