"""Compile an ExportedProgram module for NVIDIA GPUs using TensorRT
@@ -137,6 +138,7 @@ def compile(
enable_experimental_decompositions (bool): Use the full set of operator decompositions. These decompositions may not be tested but serve to make the graph easier to convert to TensorRT, potentially increasing the amount of graphs run in TensorRT.
dryrun (bool): Toggle for "Dryrun" mode, running everything except conversion to TRT and logging outputs
hardware_compatible (bool): Build the TensorRT engines compatible with GPU architectures other than that of the GPU on which the engine was built (currently works for NVIDIA Ampere and newer)
+timing_cache_path (str): Path to the timing cache if it exists (or) where it will be saved after compilation
**kwargs: Any,
Returns:
torch.fx.GraphModule: Compiled FX Module, when run it will execute via TensorRT
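The `timing_cache_path` option documented above can be exercised roughly as follows. This is a minimal sketch, not a definitive recipe: the toy model, example input, and cache location are illustrative, and it assumes a CUDA-capable GPU with TensorRT and `torch_tensorrt` installed.

```python
import torch
import torch_tensorrt

# Illustrative toy model; any exportable nn.Module works.
model = torch.nn.Linear(4, 2).eval().cuda()
example_inputs = (torch.randn(1, 4).cuda(),)

exported = torch.export.export(model, example_inputs)

# The first compilation populates the timing cache at the given path;
# subsequent compilations pointed at the same file can reuse the recorded
# kernel tactic timings instead of re-profiling, speeding up engine builds.
trt_module = torch_tensorrt.dynamo.compile(
    exported,
    inputs=example_inputs,
    timing_cache_path="/tmp/trt_timing.cache",  # illustrative path
)
```

Reusing one cache file across builds of similar models on the same GPU is the intended workflow; the cache is architecture-specific, so it should not be shared across different GPU models.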
dla_global_dram_size (int): Host RAM used by DLA to store weights and metadata for execution
calibrator (Union(torch_tensorrt._C.IInt8Calibrator, tensorrt.IInt8Calibrator)): Calibrator object which will provide data to the PTQ system for INT8 Calibration
allow_shape_tensors: (Experimental) Allow aten::size to output shape tensors using IShapeLayer in TensorRT
-
+timing_cache_path (str): Path to the timing cache if it exists (or) where it will be saved after compilation
Returns:
bytes: Serialized TensorRT engine, can either be saved to a file or deserialized via TensorRT APIs
py/torch_tensorrt/dynamo/_settings.py
3 lines changed: 3 additions & 0 deletions
@@ -24,6 +24,7 @@
REFIT,
REQUIRE_FULL_COMPILATION,
SPARSE_WEIGHTS,
+TIMING_CACHE_PATH,
TRUNCATE_DOUBLE,
USE_FAST_PARTITIONER,
USE_PYTHON_RUNTIME,
@@ -71,6 +72,7 @@ class CompilationSettings:
TRT Engines. Prints detailed logs of the graph structure and nature of partitioning. Optionally saves the
output to a file if a string path is specified
hardware_compatible (bool): Build the TensorRT engines compatible with GPU architectures other than that of the GPU on which the engine was built (currently works for NVIDIA Ampere and newer)
+timing_cache_path (str): Path to the timing cache if it exists (or) where it will be saved after compilation
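Since `CompilationSettings` is a dataclass, the new field added in this hunk can also be set directly when constructing settings by hand. A minimal sketch, assuming `torch_tensorrt` is installed and the path is illustrative:

```python
from torch_tensorrt.dynamo import CompilationSettings

# Build settings with an explicit timing cache location; unspecified fields
# fall back to their defaults (including TIMING_CACHE_PATH when this
# argument is omitted).
settings = CompilationSettings(timing_cache_path="/tmp/trt_timing.cache")
print(settings.timing_cache_path)
```

Keeping the default in `_defaults.TIMING_CACHE_PATH` and threading it through the settings dataclass matches how the other compilation options in this file are plumbed.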