``docsrc/tutorials/getting_started_with_fx_path.rst``
Converting a PyTorch Model to TensorRT Engine
---------------------------------------------
In general, users are welcome to use ``compile()`` to finish the conversion from a model to a TensorRT engine. It is a
wrapper API that consists of the major steps needed to finish this conversion. Please refer to the example usage in the ``lower_example.py`` file under ``examples/fx``.
.. code-block:: python

    def compile(
        module: nn.Module,
        input,
        max_batch_size: int = 2048,
        max_workspace_size=1 << 25,
        explicit_batch_dimension=False,
        lower_precision=LowerPrecision.FP16,
        verbose_log=False,
        timing_cache_prefix="",
        save_timing_cache=False,
        cuda_graph_batch_size=-1,
        dynamic_batch=True,
    ) -> nn.Module:
        """
        Takes in an original module, inputs, and a lowering setting; runs the lowering
        workflow to turn the module into a lowered module, also called a TRTModule.

        Args:
            module: Original module for lowering.
            input: Input for module.
            max_batch_size: Maximum batch size (must be >= 1 to be set, 0 means not set).
            max_workspace_size: Maximum size of workspace given to TensorRT.
            explicit_batch_dimension: Use explicit batch dimension in TensorRT if set to True, otherwise use implicit batch dimension.
            lower_precision: lower_precision config given to TRTModule.
            verbose_log: Enable verbose log for TensorRT if set to True.
            timing_cache_prefix: Timing cache file name for the timing cache used by fx2trt.
            save_timing_cache: Update timing cache with current timing cache data if set to True.
            cuda_graph_batch_size: CUDA graph batch size, defaults to -1.
            dynamic_batch: Batch dimension (dim=0) is dynamic.
        Returns:
            A torch.nn.Module lowered by TensorRT.
        """

In this section, we will go through an example to illustrate the major steps that fx path uses.
Users can refer to the ``fx2trt_example.py`` file in ``examples/fx``.
Explicit batch is the default mode and it must be set for dynamic shape.

As an example of the last case, if we have a 3D tensor ``t`` shaped as (batch, sequence, dimension), operations such as ``torch.transpose(0, 2)`` move the batch dimension away from dim 0. If any of these three conditions is not satisfied, we need to specify ``InputTensorSpec`` as inputs with dynamic range.
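The effect of such a transpose on the shape can be sketched without any framework dependency; the helper below is hypothetical (not part of fx2trt) and only illustrates how swapping dims 0 and 2 relocates the batch axis:

.. code-block:: python

    # Hypothetical helper: shows how transposing dims 0 and 2 of a
    # (batch, sequence, dimension) shape moves the batch axis off dim 0,
    # which is why such models need InputTensorSpec with a dynamic range.
    def transpose_0_2_shape(shape):
        s = list(shape)
        s[0], s[2] = s[2], s[0]
        return tuple(s)

    print(transpose_0_2_shape((8, 128, 512)))  # (512, 128, 8): batch is now dim 2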