to standard TorchScript. Load with `torch.jit.load()` and run like you would run any other module.

```
trtorchc [input_file_path] [output_file_path]
    [input_specs...] {OPTIONS}

    TRTorch is a compiler for TorchScript; it will compile and optimize
    TorchScript programs to run on NVIDIA GPUs using TensorRT

  OPTIONS:

      -h, --help                        Display this help menu
      Verbosity of the compiler
      -v, --verbose                     Dumps debugging information about the
                                        compilation process onto the console
      -w, --warnings                    Disables warnings generated during
                                        compilation onto the console (warnings
                                        are on by default)
      --i, --info                       Dumps info messages generated during
                                        compilation onto the console
      --build-debuggable-engine         Creates a debuggable engine
      --use-strict-types                Restrict operating type to only use set
                                        operation precision
      --allow-gpu-fallback              (Only used when targeting DLA
                                        (device-type)) Lets engine run layers on
                                        GPU if they are not supported on DLA
      --allow-torch-fallback            Enable layers to run in torch if they
                                        are not supported in TensorRT
      --disable-tf32                    Prevent Float32 layers from using the
                                        TF32 data format
      --sparse-weights                  Enable sparsity for weights of conv and
                                        FC layers
      -p[precision...],
      --enabled-precision=[precision...]
                                        (Repeatable) Enabling an operating
                                        precision for kernels to use when
                                        building the engine (Int8 requires a
                                        calibration-cache argument) [ float |
                                        float32 | f32 | fp32 | half | float16 |
                                        f16 | fp16 | int8 | i8 | char ]
                                        (default: float)
      -d[type], --device-type=[type]    The type of device the engine should be
                                        built for [ gpu | dla ] (default: gpu)
      --gpu-id=[gpu_id]                 GPU id if running on multi-GPU platform
                                        (defaults to 0)
      --dla-core=[dla_core]             DLACore id if running on available DLA
                                        (defaults to 0)
      --engine-capability=[capability]  The type of device the engine should be
                                        built for [ standard | safety |
                                        dla_standalone ]
      --calibration-cache-file=[file_path]
                                        Path to calibration cache file to use
                                        for post training quantization
      --ffo=[forced_fallback_ops...],
      --forced-fallback-op=[forced_fallback_ops...]
                                        (Repeatable) Operator in the graph that
                                        should be forced to fallback to PyTorch
                                        for execution (allow torch fallback must
                                        be set)
      --ffm=[forced_fallback_mods...],
      --forced-fallback-mod=[forced_fallback_mods...]
                                        (Repeatable) Module that should be
                                        forced to fallback to PyTorch for
                                        execution (allow torch fallback must be
                                        set)
      --embed-engine                    Whether to treat input file as a
                                        serialized TensorRT engine and embed it
                                        into a TorchScript module (device spec
                                        must be provided)
      --num-min-timing-iter=[num_iters] Number of minimization timing iterations
                                        used to select kernels
      --num-avg-timing-iters=[num_iters]
                                        Number of averaging timing iterations
                                        used to select kernels
      --workspace-size=[workspace_size] Maximum size of workspace given to
                                        TensorRT
      --max-batch-size=[max_batch_size] Maximum batch size (must be >= 1 to be
                                        set, 0 means not set)
      -t[threshold],
      --threshold=[threshold]           Maximum acceptable numerical deviation
                                        from standard TorchScript output
                                        (default 2e-5)
      --no-threshold-check              Skip checking threshold compliance
      --truncate-long-double,
      --truncate, --truncate-64bit      Truncate weights that are provided in
                                        64bit to 32bit (Long, Double to Int,
                                        Float)
      --save-engine                     Instead of compiling a full TorchScript
                                        program, save the created engine to the
                                        path specified as the output path
      input_file_path                   Path to input TorchScript file
      output_file_path                  Path for compiled TorchScript (or
                                        TensorRT engine) file
      input_specs...                    Specs for inputs to engine, can either
                                        be a single size or a range defined by
                                        Min, Optimal, Max sizes, e.g.
                                        "(N,..,C,H,W)"
                                        "[(MIN_N,..,MIN_C,MIN_H,MIN_W);(OPT_N,..,OPT_C,OPT_H,OPT_W);(MAX_N,..,MAX_C,MAX_H,MAX_W)]".
                                        Data Type and format can be specified by
                                        adding an "@" followed by dtype and "%"
                                        followed by format to the end of the
                                        shape spec. e.g. "(3, 3, 32,
                                        32)@f16%NHWC"
      "--" can be used to terminate flag options and force all following
      arguments to be treated as positional options
```
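The input-spec grammar above (a shape, optionally followed by "@" and a dtype and "%" and a format, with dynamic shapes given as a bracketed min/opt/max triple) can be sketched with a small parser. This is a hypothetical Python illustration only; `parse_input_spec` is not part of trtorchc, which handles this parsing internally.

```python
import re

def parse_input_spec(spec: str):
    """Parse a trtorchc-style input spec, e.g. "(3, 3, 32, 32)@f16%NHWC".

    Hypothetical helper for illustration, not trtorchc's actual API.
    Returns (shapes, dtype, fmt), where shapes holds one tuple for a
    static shape or three tuples (min, opt, max) for a dynamic range.
    """
    dtype = fmt = None
    # Peel off the optional "%format" suffix, then the optional "@dtype".
    if "%" in spec:
        spec, fmt = spec.rsplit("%", 1)
    if "@" in spec:
        spec, dtype = spec.rsplit("@", 1)
    body = spec.strip()
    if body.startswith("["):
        # Dynamic range: "[(min);(opt);(max)]"
        parts = body.strip("[]").split(";")
    else:
        # Single static shape: "(N,..,C,H,W)"
        parts = [body]
    shapes = [tuple(int(d) for d in re.findall(r"\d+", p)) for p in parts]
    return shapes, dtype, fmt

print(parse_input_spec("(3, 3, 32, 32)@f16%NHWC"))
```

A static spec yields a single shape tuple plus the dtype and format; a bracketed spec yields the three min/opt/max tuples that TensorRT uses to build an optimization profile.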

e.g.