
fix: ND input quantization#4210

Open
narendasan wants to merge 1 commit into main from narendasan/nd_block_quantization

Conversation

@narendasan
Collaborator

Description

Flattens ND quantization inputs to 2D

Fixes #4201
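The flatten-to-2D approach described above can be sketched as follows. This is a minimal illustration, not the actual code in this PR: `flatten_for_quantization` and `restore_shape` are hypothetical helper names, and the input shape is the 5D case reported in issue #4201.

```python
import torch

def flatten_for_quantization(x: torch.Tensor) -> tuple[torch.Tensor, torch.Size]:
    # Collapse every leading dimension into one, so a converter that only
    # supports 2D inputs sees a (rows, innermost_dim) tensor.
    original_shape = x.shape
    return x.reshape(-1, original_shape[-1]), original_shape

def restore_shape(x_2d: torch.Tensor, original_shape: torch.Size) -> torch.Tensor:
    # Undo the flattening after the quantize/dequantize op has run.
    return x_2d.reshape(original_shape)

# The 5D input shape reported in issue #4201:
x = torch.randn(11520, 3, 2, 16, 16)
x_2d, shape = flatten_for_quantization(x)
print(x_2d.shape)  # torch.Size([1105920, 16])
```

Because `reshape` preserves element order, flattening before the quantize op and restoring the shape afterward leaves the values untouched while satisfying the converter's 2D/3D input constraint.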

Type of change

Please delete options that are not relevant and/or add your own.

  • Bug fix (non-breaking change which fixes an issue)

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes
  • I have added the relevant labels to my PR so that the relevant reviewers are notified

@meta-cla meta-cla Bot added the cla signed label Apr 23, 2026
@github-actions github-actions Bot added the labels component: tests, component: conversion, component: core, component: converters, component: api [Python], and component: dynamo Apr 23, 2026
@github-actions github-actions Bot requested a review from zewenli98 April 23, 2026 17:31

@github-actions github-actions Bot left a comment


There are some changes that do not conform to Python style guidelines:

--- /home/runner/work/TensorRT/TensorRT/tests/py/dynamo/models/test_models_export.py	2026-04-23 17:31:15.121758+00:00
+++ /home/runner/work/TensorRT/TensorRT/tests/py/dynamo/models/test_models_export.py	2026-04-23 17:31:41.297024+00:00
@@ -409,13 +409,13 @@
                use_explicit_typing=True,
            )
            expected = model(input_tensor)
            outputs_trt = trt_model(input_tensor)
            abs_diff = torch.abs(expected - outputs_trt)
-            assert torch.allclose(expected, outputs_trt, rtol=0.3, atol=0.3), (
-                f"max abs diff: {abs_diff.max().item()}"
-            )
+            assert torch.allclose(
+                expected, outputs_trt, rtol=0.3, atol=0.3
+            ), f"max abs diff: {abs_diff.max().item()}"


@unittest.skipIf(
    torch.cuda.get_device_capability() < (8, 9),
    "FP8 quantization requires compute capability 8.9 or later",
@@ -472,13 +472,13 @@
                use_explicit_typing=True,
            )
            expected = model(input_tensor)
            outputs_trt = trt_model(input_tensor)
            abs_diff = torch.abs(expected - outputs_trt)
-            assert torch.allclose(expected, outputs_trt, rtol=5e-2, atol=5e-2), (
-                f"max abs diff: {abs_diff.max().item()}"
-            )
+            assert torch.allclose(
+                expected, outputs_trt, rtol=5e-2, atol=5e-2
+            ), f"max abs diff: {abs_diff.max().item()}"


@unittest.skipIf(
    torch.cuda.get_device_capability() < (8, 9),
    "FP8 quantization requires compute capability 8.9 or later",

@narendasan narendasan force-pushed the narendasan/nd_block_quantization branch from b616c6a to 75280f8 Compare April 23, 2026 17:41

Labels

  • cla signed
  • component: api [Python] Issues re: Python API
  • component: conversion Issues re: Conversion stage
  • component: converters Issues re: Specific op converters
  • component: core Issues re: The core compiler
  • component: dynamo Issues relating to the `torch.compile` or `torch._dynamo.export` paths
  • component: tests Issues re: Tests


Development

Successfully merging this pull request may close these issues.

🐛 [Bug] dynamic_block_quantize converter received an input of (11520, 3, 2, 16, 16) shape. Supported shapes: 2D or 3D
