Bug Description
When compiling an MLP submodule of the HuggingFace GPT2 model from TorchScript to Torch-TRT, the following error is encountered:
GRAPH: [Torch-TensorRT - Debug Build] - Input to node: %self.c_fc.weight.1 : Float(768, 3072, strides=[3072, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=<Tensor>]()
GRAPH: [Torch-TensorRT - Debug Build] - Input outputs a Tensor
GRAPH: [Torch-TensorRT - Debug Build] - Input is a constant
Traceback (most recent call last):
File "case_dict.py", line 278, in <module>
main2()
File "case_dict.py", line 268, in main2
comp = torchtrt.compile(traced, inputs=inp, enabled_precisions={torch.float}, truncate_long_and_double=True)
File "~/TensorRT/py/torch_tensorrt/_compile.py", line 125, in compile
return torch_tensorrt.ts.compile(
File "~/TensorRT/py/torch_tensorrt/ts/_compiler.py", line 136, in compile
compiled_cpp_mod = _C.compile_graph(module._c, _parse_compile_spec(spec))
RuntimeError: required keyword attribute 'upscale_factor' is undefined
To Reproduce
Steps to reproduce the behavior:
- Instantiate GPT2 model from pretrained source:
from transformers import GPT2Model
model = GPT2Model.from_pretrained("gpt2", use_cache=False, torchscript=True).eval().cuda()
- Select a small portion of the model:
model_portion = model.h[0].mlp
- Generate hidden state data:
hidden_state = torch.rand((1, 768)).cuda()
- Trace the model:
traced = torch.jit.trace(model_portion, hidden_state).cuda().eval()
- Compile the model (a consolidated reproduction script follows below):
trt_model = torchtrt.compile(traced, inputs=[hidden_state], enabled_precisions={torch.float}, truncate_long_and_double=True)
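
For convenience, the steps above combined into a single script. The `torchtrt` alias for `torch_tensorrt` is assumed from the traceback, and the trace input is the same `hidden_state` tensor:

```python
import torch
import torch_tensorrt as torchtrt
from transformers import GPT2Model

# Load GPT2 and isolate the MLP block of the first transformer layer
model = GPT2Model.from_pretrained("gpt2", use_cache=False, torchscript=True).eval().cuda()
model_portion = model.h[0].mlp

# Dummy hidden state matching GPT2's hidden size (768)
hidden_state = torch.rand((1, 768)).cuda()

# Trace the submodule, then compile via the TorchScript frontend
traced = torch.jit.trace(model_portion, hidden_state).cuda().eval()
trt_model = torchtrt.compile(
    traced,
    inputs=[hidden_state],
    enabled_precisions={torch.float},
    truncate_long_and_double=True,
)
```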
Expected behavior
The module should compile successfully via the TorchScript path.
Environment
- Transformers: 4.27.2
- Torch-TensorRT Version: 038520d
- PyTorch Version: 2.1.0.dev20230317+cu117
- CPU Architecture: Intel Xeon CPU
- OS: Ubuntu 20.04
- How you installed PyTorch: pip
- Build command you used: python setup.py develop
- Are you using local sources or building from archives: local
- Python version: 3.8.13
- CUDA version: 11.7
Additional Considerations
The full GPT2 model compiles and runs successfully (see #1455), so each of its component submodules should also compile.
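
As a sanity check (a minimal sketch, not part of the original report, reusing `traced`, `model_portion`, and `hidden_state` from the reproduction script above), the traced submodule runs correctly under plain TorchScript, which points the failure at the Torch-TRT conversion step rather than at tracing:

```python
import torch

# The traced module executes under plain TorchScript and matches eager
# output, so the 'upscale_factor' error arises only during Torch-TRT
# conversion of the graph.
with torch.no_grad():
    ts_out = traced(hidden_state)
    eager_out = model_portion(hidden_state)

print(torch.allclose(ts_out, eager_out, atol=1e-6))  # expected: True
```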