Access violation when repeatedly creating/destroying inference session for TensorRT Execution Provider #24529

Open
nietras opened this issue Apr 24, 2025 · 1 comment
Labels
api:CSharp issues related to the C# API ep:TensorRT issues related to TensorRT execution provider

Comments

@nietras
Contributor

nietras commented Apr 24, 2025

Describe the bug

An access violation is thrown during the engine building phase of the TensorRT Execution Provider when an inference session is repeatedly created and destroyed.

System information

OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10 x64
ONNX Runtime installed from (source or binary): binary
ONNX Runtime version: 1.19.2 AND 1.21.0
Python version: N/A (using C# wrapper)
Visual Studio version (if applicable): 2022
CUDA version: 11.8.0 AND 12.8.1
cuDNN version: 8.9.7.29 AND 9.8.0.87
TensorRT version: 10.5.0.18 AND 10.8.0.43
GPU model and memory: RTX 3080

To Reproduce

Unfortunately, we have not been able to reproduce the access violation in isolation. In our use case we simply create an inference session from an ONNX model with the TensorRT Execution Provider, run it for a while, then destroy it, run without a session for a while, and then create a new one. After repeating this some number of times, the access violation occurs for unknown reasons. Only a handful of repetitions are needed, and it always happens. Note that the model (a basic CNN) normally works fine and has done so for a few years.
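For context, the create/run/destroy cycle described above looks roughly like the following minimal sketch, assuming the standard Microsoft.ML.OnnxRuntime C# API. The model path, input name, tensor shape, and iteration count are illustrative placeholders, not our actual code:

```csharp
using System.Collections.Generic;
using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;

class Repro
{
    static void Main()
    {
        // After some number N of iterations, the access violation occurs
        // inside the TensorRT EP's engine build.
        for (int i = 0; i < 10; i++)
        {
            using var options = new SessionOptions();
            options.AppendExecutionProvider_Tensorrt(deviceId: 0); // TensorRT EP first
            options.AppendExecutionProvider_CUDA(deviceId: 0);     // CUDA as fallback

            using var session = new InferenceSession("model.onnx", options);

            // Placeholder input; real app runs inference for a while here.
            var input = new DenseTensor<float>(new[] { 1, 3, 224, 224 });
            var inputs = new List<NamedOnnxValue>
            {
                NamedOnnxValue.CreateFromTensor("input", input)
            };
            using var results = session.Run(inputs);

            // Session and options are disposed at the end of each iteration;
            // the next iteration rebuilds the TensorRT engine from scratch.
        }
    }
}
```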

Screenshots

[Screenshot attached showing the debugging session at the point of the access violation.]

Code

https://github.com/microsoft/onnxruntime/blob/v1.19.2/onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.cc#L2258

Additional context

As the screenshot shows, we have tried hard to debug this but have hit a dead end; we do not understand why the access violation occurs. Hence this issue, in the hope of getting help finding and remedying the problem. Understandably, this is difficult given that we have been unable to produce an isolated reproduction despite trying. We are unsure why that is not possible, since our application does not do anything particularly unusual.

A similar issue is "Access violation when using TensorRT ExecutionProvider on multiple GPU" #7322, but it appears to offer no solution for our case.

@github-actions github-actions bot added api:CSharp issues related to the C# API ep:TensorRT issues related to TensorRT execution provider labels Apr 24, 2025
@nietras nietras changed the title Access violation when repeatedly creating/destroying inferense session for TensorRT Execution Provider Access violation when repeatedly creating/destroying inference session for TensorRT Execution Provider Apr 24, 2025
@jywu-msft
Member

Do you know if this issue reproduces with native TensorRT (without ONNX Runtime)? i.e., using trtexec to build the engine? (assuming the full graph is supported by TensorRT)
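The native-TensorRT check suggested above could be attempted with the trtexec tool that ships with TensorRT. A rough sketch, with the model and engine paths as placeholders:

```shell
# Build a TensorRT engine directly from the ONNX model, bypassing ONNX Runtime.
# Repeating the build several times mimics the repeated session creation and
# shows whether the access violation reproduces in native TensorRT alone.
for i in 1 2 3 4 5; do
  trtexec --onnx=model.onnx --saveEngine=model.engine
done
```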
