
Commit 81cded1

Author: Wei
Update getting_started_with_fx_path.rst (#1342)
1 parent 5f4588e commit 81cded1

File tree

1 file changed: +37 −2 lines changed


docsrc/tutorials/getting_started_with_fx_path.rst

Lines changed: 37 additions & 2 deletions
@@ -20,7 +20,42 @@ user want to use this tool and we will introduce them here.
 Converting a PyTorch Model to TensorRT Engine
 ---------------------------------------------
 In general, users are welcome to use ``compile()`` to finish the conversion from a model to a TensorRT engine. It is a
-wrapper API that consists of the major steps needed to finish this conversion. Please refer to the ``lower_example.py`` file in ``examples/fx``.
+wrapper API that consists of the major steps needed to finish this conversion. Please refer to an example usage in the ``lower_example.py`` file under ``examples/fx``.
+
+.. code-block:: python
+
+    def compile(
+        module: nn.Module,
+        input,
+        max_batch_size: int = 2048,
+        max_workspace_size=1 << 25,
+        explicit_batch_dimension=False,
+        lower_precision=LowerPrecision.FP16,
+        verbose_log=False,
+        timing_cache_prefix="",
+        save_timing_cache=False,
+        cuda_graph_batch_size=-1,
+        dynamic_batch=True,
+    ) -> nn.Module:
+        """
+        Takes in an original module, input, and a lowering setting, and runs the lowering
+        workflow to turn the module into a lowered module, also called a TRTModule.
+
+        Args:
+            module: Original module for lowering.
+            input: Input for the module.
+            max_batch_size: Maximum batch size (must be >= 1 to be set; 0 means not set).
+            max_workspace_size: Maximum size of workspace given to TensorRT.
+            explicit_batch_dimension: Use explicit batch dimension in TensorRT if set to True, otherwise use implicit batch dimension.
+            lower_precision: lower_precision config given to TRTModule.
+            verbose_log: Enable verbose log for TensorRT if set to True.
+            timing_cache_prefix: Timing cache file name for the timing cache used by fx2trt.
+            save_timing_cache: Update the timing cache with current timing data if set to True.
+            cuda_graph_batch_size: CUDA graph batch size, defaults to -1.
+            dynamic_batch: Batch dimension (dim=0) is dynamic.
+        Returns:
+            A torch.nn.Module lowered by TensorRT.
+        """
 
 In this section, we will go through an example to illustrate the major steps that the fx path uses.
 Users can refer to the ``fx2trt_example.py`` file in ``examples/fx``.
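As a sketch of how the ``compile()`` wrapper described above might be invoked (hypothetical usage; ``compile`` and ``LowerPrecision`` come from the fx2trt toolchain, import paths are assumptions, and actually running the lowering requires TensorRT and a CUDA GPU, so that call is shown commented out):

```python
import torch
import torch.nn as nn

# A small example network; any FX-traceable nn.Module would do.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(16, 4)

    def forward(self, x):
        return torch.relu(self.linear(x))

model = TinyNet().eval()
inputs = [torch.randn(8, 16)]

# With TensorRT and a CUDA GPU available, the wrapper described above could be
# called roughly like this (sketch only; not executed here):
#
#     trt_module = compile(
#         model.cuda(),
#         [t.cuda() for t in inputs],
#         max_batch_size=64,
#         lower_precision=LowerPrecision.FP16,
#     )
#     trt_out = trt_module(inputs[0].cuda())

# Eager-mode reference output, useful for checking the lowered module later.
out = model(inputs[0])
print(tuple(out.shape))  # -> (8, 4)
```

The lowered TRTModule is itself an ``nn.Module``, so its output can be compared directly against the eager-mode reference computed above.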
@@ -56,7 +91,7 @@ Explicit batch is the default mode and it must be set for dynamic shape. For mos
 
 For example, for the last path: if we have a 3D tensor ``t`` shaped as (batch, sequence, dimension), operations such as ``torch.transpose(0, 2)`` fall into it. If any of these three are not satisfied, we'll need to specify InputTensorSpec as inputs with dynamic range.
 
-.. code-block:: shell
+.. code-block:: python
 
     import deeplearning.trt.fx2trt.converter.converters
     from torch.fx.experimental.fx2trt.fx2trt import InputTensorSpec, TRTInterpreter
