feat: Autocast #3878

zewenli98 · 2025-10-28T05:15:58Z

Description

Weak typing behavior in TensorRT is deprecated. However, it is a good way to maximize performance. Therefore, we want to create a similar PyTorch-native system for use with Torch-TensorRT that recovers some of this behavior.

Fixes #3869

Type of change

New feature (non-breaking change which adds functionality)

Checklist:

My code follows the style guidelines of this project (You can use the linters)
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas and hacks
I have made corresponding changes to the documentation
I have added tests to verify my fix or my feature
New and existing unit tests pass locally with my changes
I have added the relevant labels to my PR so that relevant reviewers are notified

py/torch_tensorrt/dynamo/_compiler.py

zewenli98 · 2025-10-29T02:17:51Z

py/torch_tensorrt/dynamo/_compiler.py

+    enable_autocast: bool = _defaults.ENABLE_AUTOCAST,
+    low_precision_type: Optional[
+        Union[torch.dtype, dtype]
+    ] = _defaults.LOW_PRECISION_TYPE,
+    nodes_to_exclude: Collection[str] = _defaults.NODES_TO_EXCLUDE,
+    targets_to_exclude: Collection[Target] = _defaults.TARGETS_TO_EXCLUDE,
+    data_max: float = _defaults.DATA_MAX,
+    max_depth_of_reduction: Optional[int] = _defaults.MAX_DEPTH_OF_REDUCTION,


Before merging, these args should be added to other compile functions in this file.
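
For readers following the thread, here is a minimal sketch of how the exclusion arguments in the diff above could gate a per-node precision decision. It is not the code in this PR: `Node`, `AutocastRules`, and `runs_in_low_precision` are hypothetical names, targets are modeled as plain strings, and exclusion is assumed to be an exact match on node name or target.

```python
from dataclasses import dataclass, field
from typing import Collection


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a graph node: a name plus an op target."""

    name: str
    target: str


@dataclass
class AutocastRules:
    """Illustrative container; field names mirror the arguments in the diff."""

    enable_autocast: bool = False
    nodes_to_exclude: Collection[str] = field(default_factory=set)
    targets_to_exclude: Collection[str] = field(default_factory=set)


def runs_in_low_precision(node: Node, rules: AutocastRules) -> bool:
    """Return True if this sketch would cast the node to the low-precision type."""
    if not rules.enable_autocast:
        return False
    # Assumption: exclusion is an exact match on the node name or the target.
    if node.name in rules.nodes_to_exclude:
        return False
    if node.target in rules.targets_to_exclude:
        return False
    return True


rules = AutocastRules(
    enable_autocast=True,
    nodes_to_exclude={"conv1"},
    targets_to_exclude={"aten.softmax"},
)
decisions = {
    n.name: runs_in_low_precision(n, rules)
    for n in (
        Node("conv1", "aten.convolution"),
        Node("linear1", "aten.mm"),
        Node("attn_softmax", "aten.softmax"),
    )
}
```

In this sketch only `linear1` is selected for low precision; the other two nodes are excluded by name and by target respectively.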

zewenli98 · 2025-10-29T02:19:36Z

py/torch_tensorrt/dynamo/lowering/passes/nodeclassifier.py

+        ]:
+            # GEMM: A (M, K) @ B (K, N) = C (M, N)
+            self.reduction_depth = input_0_dims[-1]
+        # TODO: Add more reduction ops here


Should any more reduction targets be added?

How are these reduction targets chosen?
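
As context for the questions above, the GEMM comment in the diff can be read as follows: for A (M, K) @ B (K, N), each output element sums K products, so the reduction depth is the last dimension of the first input. The sketch below illustrates that reading with hypothetical helpers. It assumes that `max_depth_of_reduction=None` disables the check and that a node over the limit would stay in higher precision; neither assumption is confirmed by the source.

```python
from typing import Optional, Sequence


def gemm_reduction_depth(a_shape: Sequence[int], b_shape: Sequence[int]) -> int:
    """Return K for A (..., M, K) @ B (..., K, N)."""
    if len(a_shape) < 2 or len(b_shape) < 2:
        raise ValueError("both operands need at least two dimensions")
    k = a_shape[-1]  # same quantity as input_0_dims[-1] in the diff
    if b_shape[-2] != k:
        raise ValueError(f"inner dimensions differ: {k} vs {b_shape[-2]}")
    return k


def exceeds_depth_limit(depth: int, max_depth_of_reduction: Optional[int]) -> bool:
    """Assumption: None means 'no limit'; otherwise compare against the limit."""
    return max_depth_of_reduction is not None and depth > max_depth_of_reduction


depth = gemm_reduction_depth((8, 512), (512, 16))
keep_high_precision = exceeds_depth_limit(depth, max_depth_of_reduction=256)
```

Longer reductions accumulate more rounding error in low precision, which is the usual motivation for a depth limit of this kind.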

py/torch_tensorrt/dynamo/runtime/_PythonTorchTensorRTModule.py

py/torch_tensorrt/dynamo/lowering/passes/rule_based_autocast.py

peri044

Can you also update the documentation at https://github.com/pytorch/TensorRT/blob/main/docsrc/user_guide/mixed_precision.rst?

core/runtime/execute_engine.cpp

examples/dynamo/autocast_example.py

py/torch_tensorrt/dynamo/lowering/passes/nodeclassifier.py

py/torch_tensorrt/dynamo/lowering/passes/rule_based_autocast.py

py/torch_tensorrt/dynamo/_compiler.py

narendasan · 2025-11-06T19:24:06Z

For tests:

External autocast in PyTorch combined with strong typing
Whole-graph autocast pass
A test case that exercises the max_output_threshold fallback

L1 or L2 tests
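
One possible shape for the threshold test case is sketched below. It assumes that the `max_output_threshold` fallback means "keep a node in FP32 when the magnitude of its observed outputs exceeds the threshold"; the helper names and the threshold value are hypothetical and do not come from the PR.

```python
from typing import Iterable


def exceeds_output_threshold(
    outputs: Iterable[float], max_output_threshold: float
) -> bool:
    """True if any observed output magnitude is above the threshold."""
    return any(abs(v) > max_output_threshold for v in outputs)


def choose_precision(outputs: Iterable[float], max_output_threshold: float) -> str:
    """Assumed fallback rule: large outputs keep the node in fp32."""
    if exceeds_output_threshold(outputs, max_output_threshold):
        return "fp32"
    return "fp16"


def test_small_outputs_use_low_precision() -> None:
    assert choose_precision([0.5, -3.0, 42.0], max_output_threshold=100.0) == "fp16"


def test_large_outputs_fall_back_to_fp32() -> None:
    assert choose_precision([0.5, -300.0], max_output_threshold=100.0) == "fp32"


test_small_outputs_use_low_precision()
test_large_outputs_fall_back_to_fp32()
```

A real test would run the compiled module on calibration inputs and inspect the resulting layer precisions rather than call a helper directly.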

docsrc/user_guide/mixed_precision.rst

examples/dynamo/autocast_example.py

py/torch_tensorrt/dynamo/lowering/passes/_aten_lowering_pass.py

narendasan

Nice, it's looking good. Some final polishing details, then I think it's good to go.

zewenli98 added 2 commits October 27, 2025 21:59

implement autocast

eac8809

fix bug

f6c7c7c

zewenli98 self-assigned this Oct 28, 2025

meta-cla bot added the cla signed label Oct 28, 2025

github-actions bot requested a review from apbose October 28, 2025 05:16

zewenli98 removed the request for review from apbose October 28, 2025 05:16

add arg enable_autocast

f7d8068

github-actions bot removed the component: conversion Issues re: Conversion stage label Oct 29, 2025

zewenli98 commented Oct 29, 2025

py/torch_tensorrt/dynamo/_compiler.py

zewenli98 commented Oct 29, 2025

py/torch_tensorrt/dynamo/runtime/_PythonTorchTensorRTModule.py

zewenli98 commented Oct 30, 2025

py/torch_tensorrt/dynamo/lowering/passes/rule_based_autocast.py

zewenli98 requested review from narendasan and peri044 October 30, 2025 02:13

peri044 reviewed Oct 30, 2025

zewenli98 added 2 commits November 4, 2025 18:50

change names of API and support for user specified node names

e15ce94

support dataloader for calibration

94757d2

zewenli98 added 2 commits November 6, 2025 18:22

fix comments

4bf12e7

optimize Cast insertion logic, fix io dtype issue and comments, and add tests

0a62149

github-actions bot added component: tests Issues re: Tests and removed component: core Issues re: The core compiler labels Nov 8, 2025

fix bugs in cpp runtime

a990653

amend doc and settings

3e008c2

github-actions bot added the documentation Improvements or additions to documentation label Nov 14, 2025