@@ -21,101 +21,107 @@ torchtrtc [input_file_path] [output_file_path]
 
 OPTIONS:
 
-    -h, --help                        Display this help menu
-    Verbiosity of the compiler
-      -v, --verbose                   Dumps debugging information about the
-                                      compilation process onto the console
-      -w, --warnings                  Disables warnings generated during
-                                      compilation onto the console (warnings
-                                      are on by default)
-      --i, --info                     Dumps info messages generated during
-                                      compilation onto the console
-    --build-debuggable-engine         Creates a debuggable engine
-    --allow-gpu-fallback              (Only used when targeting DLA
-                                      (device-type)) Lets engine run layers on
-                                      GPU if they are not supported on DLA
-    --require-full-compilation        Require that the model should be fully
-                                      compiled to TensorRT or throw an error
-    --disable-tf32                    Prevent Float32 layers from using the
-                                      TF32 data format
-    --sparse-weights                  Enable sparsity for weights of conv and
-                                      FC layers
-    -p[precision...],
-    --enabled-precision=[precision...]
-                                      (Repeatable) Enabling an operating
-                                      precision for kernels to use when
-                                      building the engine (Int8 requires a
-                                      calibration-cache argument) [ float |
-                                      float32 | f32 | fp32 | half | float16 |
-                                      f16 | fp16 | int8 | i8 | char ]
-                                      (default: float)
-    -d[type], --device-type=[type]    The type of device the engine should be
-                                      built for [ gpu | dla ] (default: gpu)
-    --gpu-id=[gpu_id]                 GPU id if running on multi-GPU platform
-                                      (defaults to 0)
-    --dla-core=[dla_core]             DLACore id if running on available DLA
-                                      (defaults to 0)
-    --engine-capability=[capability]  The type of device the engine should be
-                                      built for [ standard | safety |
-                                      dla_standalone ]
-    --calibration-cache-file=[file_path]
-                                      Path to calibration cache file to use
-                                      for post training quantization
-    --teo=[torch-executed-ops...],
-    --torch-executed-ops=[torch-executed-ops...]
-                                      (Repeatable) Operator in the graph that
-                                      should always be run in PyTorch for
-                                      execution (partial compilation must be
-                                      enabled)
-    --tem=[torch-executed-mods...],
-    --torch-executed-mods=[torch-executed-mods...]
-                                      (Repeatable) Module that should always
-                                      be run in Pytorch for execution (partial
-                                      compilation must be enabled)
-    --mbs=[torch-executed-mods...],
-    --min-block-size=[torch-executed-mods...]
-                                      Minimum number of contiguous TensorRT
-                                      supported ops to compile a subgraph to
-                                      TensorRT
-    --embed-engine                    Whether to treat input file as a
-                                      serialized TensorRT engine and embed it
-                                      into a TorchScript module (device spec
-                                      must be provided)
-    --num-min-timing-iter=[num_iters] Number of minimization timing iterations
-                                      used to select kernels
-    --num-avg-timing-iters=[num_iters]
-                                      Number of averaging timing iterations
-                                      used to select kernels
-    --workspace-size=[workspace_size] Maximum size of workspace given to
-                                      TensorRT
-    -t[threshold],
-    --threshold=[threshold]           Maximum acceptable numerical deviation
-                                      from standard torchscript output
-                                      (default 2e-5)
-    --no-threshold-check              Skip checking threshold compliance
-    --truncate-long-double,
-    --truncate, --truncate-64bit      Truncate weights that are provided in
-                                      64bit to 32bit (Long, Double to Int,
-                                      Float)
-    --save-engine                     Instead of compiling a full a
-                                      TorchScript program, save the created
-                                      engine to the path specified as the
-                                      output path
-    input_file_path                   Path to input TorchScript file
-    output_file_path                  Path for compiled TorchScript (or
-                                      TensorRT engine) file
-    input_specs...                    Specs for inputs to engine, can either
-                                      be a single size or a range defined by
-                                      Min, Optimal, Max sizes, e.g.
-                                      "(N,..,C,H,W)"
-                                      "[(MIN_N,..,MIN_C,MIN_H,MIN_W);(OPT_N,..,OPT_C,OPT_H,OPT_W);(MAX_N,..,MAX_C,MAX_H,MAX_W)]".
-                                      Data Type and format can be specified by
-                                      adding an "@" followed by dtype and "%"
-                                      followed by format to the end of the
-                                      shape spec. e.g. "(3, 3, 32,
-                                      32)@f16%NHWC"
-    "--" can be used to terminate flag options and force all following
-    arguments to be treated as positional options
+    -h, --help                        Display this help menu
+    Verbiosity of the compiler
+      -v, --verbose                   Dumps debugging information about the
+                                      compilation process onto the console
+      -w, --warnings                  Disables warnings generated during
+                                      compilation onto the console (warnings
+                                      are on by default)
+      --i, --info                     Dumps info messages generated during
+                                      compilation onto the console
+    --build-debuggable-engine         Creates a debuggable engine
+    --use-strict-types                Restrict operating type to only use set
+                                      operation precision
+    --allow-gpu-fallback              (Only used when targeting DLA
+                                      (device-type)) Lets engine run layers on
+                                      GPU if they are not supported on DLA
+    --require-full-compilation        Require that the model should be fully
+                                      compiled to TensorRT or throw an error
+    --is-supported=[method_name],
+    --supported=[method_name],
+    --check-support=[method_name],
+    --check-method-op-support=[method_name]
+                                      Check the support for end to end
+                                      compilation of a specified method in the
+                                      TorchScript module
+    --disable-tf32                    Prevent Float32 layers from using the
+                                      TF32 data format
+    --sparse-weights                  Enable sparsity for weights of conv and
+                                      FC layers
+    -p[precision...],
+    --enable-precision=[precision...] (Repeatable) Enabling an operating
+                                      precision for kernels to use when
+                                      building the engine (Int8 requires a
+                                      calibration-cache argument) [ float |
+                                      float32 | f32 | fp32 | half | float16 |
+                                      f16 | fp16 | int8 | i8 | char ]
+                                      (default: float)
+    -d[type], --device-type=[type]    The type of device the engine should be
+                                      built for [ gpu | dla ] (default: gpu)
+    --gpu-id=[gpu_id]                 GPU id if running on multi-GPU platform
+                                      (defaults to 0)
+    --dla-core=[dla_core]             DLACore id if running on available DLA
+                                      (defaults to 0)
+    --engine-capability=[capability]  The type of device the engine should be
+                                      built for [ standard | safety |
+                                      dla_standalone ]
+    --calibration-cache-file=[file_path]
+                                      Path to calibration cache file to use
+                                      for post training quantization
+    --teo=[op_name...],
+    --torch-executed-op=[op_name...]  (Repeatable) Operator in the graph that
+                                      should always be run in PyTorch for
+                                      execution (partial compilation must be
+                                      enabled)
+    --tem=[module_name...],
+    --torch-executed-mod=[module_name...]
+                                      (Repeatable) Module that should always
+                                      be run in Pytorch for execution (partial
+                                      compilation must be enabled)
+    --mbs=[min-block-size],
+    --min-block-size=[min-block-size] Minimum number of contiguous TensorRT
+                                      supported ops to compile a subgraph to
+                                      TensorRT
+    --embed-engine                    Whether to treat input file as a
+                                      serialized TensorRT engine and embed it
+                                      into a TorchScript module (device spec
+                                      must be provided)
+    --num-min-timing-iter=[num_iters] Number of minimization timing iterations
+                                      used to select kernels
+    --num-avg-timing-iters=[num_iters]
+                                      Number of averaging timing iterations
+                                      used to select kernels
+    --workspace-size=[workspace_size] Maximum size of workspace given to
+                                      TensorRT
+    -t[threshold],
+    --threshold=[threshold]           Maximum acceptable numerical deviation
+                                      from standard torchscript output
+                                      (default 2e-5)
+    --no-threshold-check              Skip checking threshold compliance
+    --truncate-long-double,
+    --truncate, --truncate-64bit      Truncate weights that are provided in
+                                      64bit to 32bit (Long, Double to Int,
+                                      Float)
+    --save-engine                     Instead of compiling a full a
+                                      TorchScript program, save the created
+                                      engine to the path specified as the
+                                      output path
+    input_file_path                   Path to input TorchScript file
+    output_file_path                  Path for compiled TorchScript (or
+                                      TensorRT engine) file
+    input_specs...                    Specs for inputs to engine, can either
+                                      be a single size or a range defined by
+                                      Min, Optimal, Max sizes, e.g.
+                                      "(N,..,C,H,W)"
+                                      "[(MIN_N,..,MIN_C,MIN_H,MIN_W);(OPT_N,..,OPT_C,OPT_H,OPT_W);(MAX_N,..,MAX_C,MAX_H,MAX_W)]".
+                                      Data Type and format can be specified by
+                                      adding an "@" followed by dtype and "%"
+                                      followed by format to the end of the
+                                      shape spec. e.g. "(3, 3, 32,
+                                      32)@f16%NHWC"
+    "--" can be used to terminate flag options and force all following
+    arguments to be treated as positional options
 ```
 
 e.g.
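The shape-spec syntax in the help text above can be combined with the repeatable precision and torch-executed-op flags in a single invocation. A sketch only: the module paths below are hypothetical placeholders, and `aten::size` is merely an illustrative operator choice.

```shell
# Sketch, assuming a traced TorchScript module exists at ssd_traced.jit.pt.
# Compile with a dynamic input range (min; opt; max shapes), fp16 input
# dtype and contiguous format appended via "@" and "%", fp16 enabled as an
# additional kernel precision, and aten::size forced to run in PyTorch
# (which requires partial compilation, i.e. no --require-full-compilation).
torchtrtc ssd_traced.jit.pt ssd_trt.ts \
  "[(1,3,300,300);(1,3,512,512);(1,3,1024,1024)]@f16%contiguous" \
  -p f32 -p f16 \
  --torch-executed-op "aten::size"
```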