.. _multi_device_safe_mode:

Multi-Device Safe Mode
====================================
5+
Multi-device safe mode is a setting in Torch-TensorRT which allows the user to determine whether
the runtime checks for device consistency prior to every inference call.

There is a non-negligible, fixed cost per-inference call when multi-device safe mode is enabled, which is why
it is now disabled by default. It can be controlled via the following convenience function, which
doubles as a context manager:
.. code-block:: python

    # Enables Multi Device Safe Mode
    torch_tensorrt.runtime.set_multi_device_safe_mode(True)

    # Disables Multi Device Safe Mode [Default Behavior]
    torch_tensorrt.runtime.set_multi_device_safe_mode(False)

    # Enables Multi Device Safe Mode, then resets the safe mode to its prior setting
    with torch_tensorrt.runtime.set_multi_device_safe_mode(True):
        ...

TensorRT requires that each engine be associated with the CUDA context in the active thread from which it is invoked.
Therefore, if the device were to change in the active thread, which may be the case when invoking
engines on multiple GPUs from the same Python process, safe mode will cause Torch-TensorRT to display
an alert and switch GPUs accordingly. If safe mode is not enabled, there could be a mismatch between the engine
device and CUDA context device, which could lead the program to crash.
One technique for managing multiple TRT engines on different GPUs while not sacrificing performance for
multi-device safe mode is to use Python threads. Each thread is responsible for all of the TRT engines
on a single GPU, and the default CUDA device on each thread corresponds to the GPU for which it is
responsible (this can be set via ``torch.cuda.set_device(...)``). In this way, multiple threads can be used in the same
Python script without needing to switch CUDA contexts and incur performance overhead.
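The one-thread-per-GPU pattern above can be sketched as follows. This is an illustrative sketch, not Torch-TensorRT API: the worker name, the stand-in engine callables, and the result bookkeeping are all hypothetical, and the ``torch.cuda.set_device`` call is shown as a comment so the sketch runs without GPUs.

.. code-block:: python

    import threading

    def gpu_worker(device_id, engines, results):
        # In a real script, pin this thread's default CUDA device once:
        # torch.cuda.set_device(device_id)
        # All subsequent CUDA work in this thread then targets device_id,
        # so no per-inference device switching (or safe mode check) is needed.
        for name, engine in engines.items():
            # `engine` stands in for a call to a compiled Torch-TensorRT module.
            results[(device_id, name)] = engine()

    # One entry per GPU; the lambdas are placeholders for compiled engines.
    engines_per_gpu = {
        0: {"encoder": lambda: "gpu0-encoder-out"},
        1: {"decoder": lambda: "gpu1-decoder-out"},
    }

    results = {}
    threads = [
        threading.Thread(target=gpu_worker, args=(dev, engines, results))
        for dev, engines in engines_per_gpu.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

Because each thread only ever touches one device, the engines it owns always see a matching CUDA context, which is what makes it safe to leave multi-device safe mode disabled in this arrangement.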