
Conversation

@lanluo-nvidia (Collaborator) commented on Apr 29, 2025

Description

Add FP4 support.

Fixes # (issue)

Type of change

Please delete options that are not relevant and/or add your own.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes
  • I have added the relevant labels to my PR so that relevant reviewers are notified

@github-actions bot added the `component: tests`, `component: conversion`, `component: converters`, `component: api [Python]`, and `component: dynamo` labels on Apr 29, 2025
@github-actions bot requested a review from peri044 on April 29, 2025 16:37
@github-actions bot left a comment:

There are some changes that do not conform to Python style guidelines:

--- /home/runner/work/TensorRT/TensorRT/py/torch_tensorrt/_enums.py	2025-04-29 16:37:46.596096+00:00
+++ /home/runner/work/TensorRT/TensorRT/py/torch_tensorrt/_enums.py	2025-04-29 16:38:07.872645+00:00
@@ -77,11 +77,11 @@
    f8 = auto()
    """8 bit floating-point number, equivalent to ``dtype.fp8`` and ``dtype.float8``
    
    :meta hide-value:
    """
-    
+
    f4 = auto()
    """4 bit floating-point number, equivalent to ``dtype.fp4`` and ``dtype.float4``

    :meta hide-value:
    """
--- /home/runner/work/TensorRT/TensorRT/py/torch_tensorrt/dynamo/conversion/impl/quantize.py	2025-04-29 16:37:46.600096+00:00
+++ /home/runner/work/TensorRT/TensorRT/py/torch_tensorrt/dynamo/conversion/impl/quantize.py	2025-04-29 16:38:08.284218+00:00
@@ -66,10 +66,11 @@
            dequantize_layer.precision = trt.DataType.FP8
        dq_output = dequantize_layer.get_output(0)

        return dq_output

+
def dynamic_block_quantize(
    ctx: ConversionContext,
    target: Target,
    source_ir: Optional[SourceIR],
    name: str,
@@ -97,19 +98,23 @@
            )
        if len(input_tensor.shape) not in (2, 3):
            raise ValueError(
                f"dynamic_block_quantize converter received an input of {input_tensor.shape} shape. Supported shapes: 2D or 3D"
            )
-        print(f"input_tensor.shape: {input_tensor.shape} {block_size=} {amax=} {num_bits=} {exponent_bits=} {scale_num_bits=} {scale_exponent_bits=}")
+        print(
+            f"input_tensor.shape: {input_tensor.shape} {block_size=} {amax=} {num_bits=} {exponent_bits=} {scale_num_bits=} {scale_exponent_bits=}"
+        )
        max_bound = 6
        amax = to_torch(amax, None)
        scale = torch.divide(amax, max_bound)
        scale = get_trt_tensor(ctx, scale, name + "_scale")

-        output_type=trt.DataType.FP4
+        output_type = trt.DataType.FP4
        # Add Q node
-        dynamic_quantize_layer = ctx.net.add_dynamic_quantize(input_tensor, axis=-1, block_size=16, output_type=output_type)
+        dynamic_quantize_layer = ctx.net.add_dynamic_quantize(
+            input_tensor, axis=-1, block_size=16, output_type=output_type
+        )
        quantize_layer.set_output_type(0, output_type)

        set_layer_name(quantize_layer, target, name + "_quantize", source_ir)
        q_output = quantize_layer.get_output(0)
        # Add DQ node
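
To make the scale arithmetic in this hunk concrete, below is a hedged, self-contained PyTorch sketch that emulates per-block FP4 quantize/dequantize. `max_bound = 6` is the largest magnitude representable in FP4 E2M1 (the representable non-negative values are 0, 0.5, 1, 1.5, 2, 3, 4, 6), so dividing `amax` by 6 places each block's largest element at the top of the FP4 range; `block_size = 16` mirrors the hard-coded value above. The helper name, the nearest-value rounding, and the divisibility assumption on the last axis are all illustrative choices, not what the TensorRT kernel actually does.

```python
# Hedged emulation of FP4 (E2M1) block quantization, not the converter
# itself. Assumes the flattened input length is divisible by block_size.
import torch

# Non-negative magnitudes representable in FP4 E2M1; the maximum is 6.0,
# which is where the converter's max_bound = 6 comes from.
FP4_E2M1_VALUES = torch.tensor([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])


def fake_fp4_block_qdq(x: torch.Tensor, block_size: int = 16) -> torch.Tensor:
    """Quantize-dequantize x per block along the last axis (illustrative)."""
    blocks = x.reshape(-1, block_size)
    # Per-block scale = amax / 6, matching the diff above.
    scale = blocks.abs().amax(dim=-1, keepdim=True) / 6.0
    scale = torch.where(scale == 0, torch.ones_like(scale), scale)
    scaled = blocks / scale
    # Round each scaled magnitude to the nearest representable E2M1 value
    # (the real kernel's rounding mode may differ).
    idx = (scaled.abs().unsqueeze(-1) - FP4_E2M1_VALUES).abs().argmin(dim=-1)
    q = FP4_E2M1_VALUES[idx] * scaled.sign()
    return (q * scale).reshape(x.shape)


x = torch.randn(2, 32)
print((x - fake_fp4_block_qdq(x)).abs().max())  # small round-trip error
```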
--- /home/runner/work/TensorRT/TensorRT/tests/py/dynamo/models/test_models_export.py	2025-04-29 16:37:46.627096+00:00
+++ /home/runner/work/TensorRT/TensorRT/tests/py/dynamo/models/test_models_export.py	2025-04-29 16:38:13.681784+00:00
@@ -195,11 +195,10 @@
        msg=f"Resnet18 Half TRT outputs don't match with the original model. Cosine sim score: {cos_sim} Threshold: {COSINE_THRESHOLD}",
    )

    # Clean up model env
    torch._dynamo.reset()
-


@unittest.skipIf(
    torch.cuda.get_device_capability() < (8, 9),
    "FP4 quantization requires compute capability 8.9 or later",

@narendasan (Collaborator) commented:

Removed the image so we aren't leaking internal info.

@github-actions bot added the `component: build system` label on May 1, 2025
@github-actions bot added the `component: lowering` label on May 2, 2025