✨[Feature] Print out Python file name, line number, and/or op for warnings, compilation failures #949

Closed
@chaoz-dev

Is your feature request related to a problem? Please describe.
By default, when running Torch-TensorRT from Python, most errors encountered during compilation print a stack trace that originates either from the Python compilation call (e.g. something like torch_tensorrt_pypi/torch_tensorrt/ts/_compiler.py", line 119, in compile compiled_cpp_mod = _C.compile_graph(module._c, _parse_compile_spec(spec)), followed by RuntimeError: ....) or from the C++ file backing the implementation.

Warnings during runtime may be printed without any identifying information.

On their own, the current stack traces and output do not provide enough information to pinpoint which PyTorch op is causing the offending behavior. This is especially painful when applying Torch-TensorRT to large, complex models, where the root cause of a conversion failure is currently hard to locate without unwinding the model and stepping through it op by op.

Describe the solution you'd like
Ideally, when an error or warning occurs, we should print output identifying the offending op (at minimum), as well as the file and line number at which the op occurs, in addition to other stack traces, errors, and warnings.
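
To make the request concrete, here is a minimal sketch of the kind of error wrapping I have in mind. Everything in it (`NodeInfo`, `ConversionError`, `convert_all`, the converter table, and the `model.py` locations) is hypothetical and only illustrates the idea; none of it is Torch-TensorRT's actual API or internals.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class NodeInfo:
    """Hypothetical stand-in for a graph node plus its source metadata."""
    kind: str      # op name, e.g. "aten::relu"
    filename: str  # Python file in which the op was called
    lineno: int    # line number of the call


class ConversionError(RuntimeError):
    """Hypothetical error that names the offending op and where it came from."""

    def __init__(self, node: NodeInfo, cause: Exception):
        super().__init__(
            f"Failed to convert {node.kind} ({node.filename}:{node.lineno}): {cause}"
        )
        self.node = node


def convert_all(nodes: List[NodeInfo],
                converters: Dict[str, Callable[[NodeInfo], None]]) -> None:
    """Convert nodes one at a time, attaching op and location to any failure."""
    for node in nodes:
        try:
            converter = converters.get(node.kind)
            if converter is None:
                raise NotImplementedError("no converter registered")
            converter(node)
        except Exception as exc:
            # Re-raise with the op and source location; keep the original
            # exception chained so the existing stack trace is not lost.
            raise ConversionError(node, exc) from exc


# Made-up example data: one supported op and one unsupported op.
converters = {"aten::relu": lambda node: None}
nodes = [
    NodeInfo("aten::relu", "model.py", 10),
    NodeInfo("aten::unsupported_op", "model.py", 12),
]

try:
    convert_all(nodes, converters)
except ConversionError as err:
    print(err)
```

The point is that the failure is attributed to a node at the moment it happens, so the message identifies the op even when nothing else has been logged yet.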

Describe alternatives you've considered
Currently, I've tried log printing at the graph level to better identify errors during the conversion process. However, there are issues with this approach:

  1. We may encounter the error before the node or subgraph in which it occurs has been printed;
  2. The printed output appears correct and does not identify the offending op anyway.

Additional context
I think this ask is potentially complicated but somewhat possible... On the one hand, Torch-TensorRT converts from the JIT output, which means obtaining metadata could be more complex than just reading the model directly. On the other hand, judging by the Interpreting Graphs section of the TorchScript documentation, file and line number information should be accessible somewhere...
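
For reference, a rough sketch of how that information can be read from Python today. It assumes the `kind()` and `sourceRange()` methods that PyTorch's Python bindings expose on JIT graph nodes; `TinyModel` and `describe_nodes` are names I made up for the example, and I use `torch.jit.trace` here only to keep the snippet short (graphs from `torch.jit.script` can be walked the same way). I have not checked how much of this metadata survives Torch-TensorRT's lowering passes.

```python
import torch


class TinyModel(torch.nn.Module):
    """Made-up model with two ops so the graph has something to show."""

    def forward(self, x):
        y = torch.relu(x)
        return y + 1


def describe_nodes(graph):
    """Return (op kind, source range text) for each top-level node in a JIT graph."""
    return [(node.kind(), node.sourceRange()) for node in graph.nodes()]


traced = torch.jit.trace(TinyModel(), torch.randn(2, 3))
node_info = describe_nodes(traced.graph)

for kind, source in node_info:
    # `source` is a human-readable string that, when available, points at
    # the Python file and line where the op was called.
    print(kind)
    print(source)
```

If the converter had access to the same node objects when it fails, it could include this string in the error or warning it emits.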
