Bug Description
When compiling the transformer encoder model for Sockeye inference, Torch-TensorRT throws a runtime error.
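For context, the failure comes from the torch_tensorrt.compile call that Sockeye makes on its TorchScript-traced encoder (sockeye/model_pt.py:200 in the stack trace below). A minimal sketch of that pattern; the toy module, shapes, and dtypes here are placeholder assumptions, not Sockeye's actual spec:

import torch
import torch_tensorrt

class ToyEncoder(torch.nn.Module):
    # Placeholder for Sockeye's transformer encoder (illustration only)
    def forward(self, data, valid_length):
        return data * 2.0, valid_length

encoder = ToyEncoder().eval().cuda()
data = torch.randn(64, 30, 512).cuda()                          # (batch, seq, model_dim), assumed
valid_length = torch.full((64,), 30, dtype=torch.int32).cuda()  # assumed lengths input

traced_encoder = torch.jit.trace(encoder, (data, valid_length))

# Mirrors the call in sockeye/model_pt.py that raises the error
traced_encoder = torch_tensorrt.compile(
    traced_encoder,
    inputs=[torch_tensorrt.Input((64, 30, 512)),
            torch_tensorrt.Input((64,), dtype=torch.int32)],
    enabled_precisions={torch.half},                            # from --dtype float16
)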
To Reproduce
Steps to reproduce the behavior:
- Start a Docker container:
docker run --gpus all --rm -it nvcr.io/nvidia/pytorch:21.11-py3
- Run the following to download and preprocess the data and train a basic model:
git clone https://github.com/blchu/sockeye.git -b tensorrt_blchu
tail -n 4 sockeye/requirements/requirements.txt > requirements.txt.tmp \
&& mv requirements.txt.tmp sockeye/requirements/requirements.txt
pip install -e ./sockeye
git clone https://github.com/rsennrich/subword-nmt.git
export PYTHONPATH=$(pwd)/subword-nmt:$PYTHONPATH
wget http://data.statmt.org/wmt17/translation-task/preprocessed/de-en/corpus.tc.de.gz
wget http://data.statmt.org/wmt17/translation-task/preprocessed/de-en/corpus.tc.en.gz
gunzip corpus.tc.de.gz
gunzip corpus.tc.en.gz
curl https://data.statmt.org/wmt17/translation-task/preprocessed/de-en/dev.tgz | tar xvzf -
head -n 32768 corpus.tc.de > corpus.tc.de.tmp && mv corpus.tc.de.tmp corpus.tc.de
head -n 32768 corpus.tc.en > corpus.tc.en.tmp && mv corpus.tc.en.tmp corpus.tc.en
python -m learn_joint_bpe_and_vocab --input corpus.tc.de corpus.tc.en \
-s 3000 \
-o bpe.codes \
--write-vocabulary bpe.vocab.de bpe.vocab.en
python -m apply_bpe -c bpe.codes --vocabulary bpe.vocab.de --vocabulary-threshold 50 < corpus.tc.de > corpus.tc.BPE.de
python -m apply_bpe -c bpe.codes --vocabulary bpe.vocab.en --vocabulary-threshold 50 < corpus.tc.en > corpus.tc.BPE.en
python -m apply_bpe -c bpe.codes --vocabulary bpe.vocab.de --vocabulary-threshold 50 < newstest2016.tc.de > newstest2016.tc.BPE.de
python -m apply_bpe -c bpe.codes --vocabulary bpe.vocab.en --vocabulary-threshold 50 < newstest2016.tc.en > newstest2016.tc.BPE.en
python -m sockeye.prepare_data_pt \
-s corpus.tc.BPE.de \
-t corpus.tc.BPE.en \
-o train_data \
--shared-vocab
torchrun --no_python --nproc_per_node 1 sockeye-train \
--prepared-data train_data \
--validation-source newstest2016.tc.BPE.de \
--validation-target newstest2016.tc.BPE.en \
--output model \
--batch-size 2048 \
--update-interval 1 \
--checkpoint-interval 1 \
--max-updates 1 \
--decoder ssru_transformer \
--shared-vocab \
--seed 1 \
--quiet-secondary-workers
- Run the translate command to compile the model with Torch-TensorRT; this is the step where the error occurs:
sockeye-translate \
--input newstest2016.tc.BPE.de \
--output out \
--model model \
--dtype float16 \
--beam-size 5 \
--batch-size 64 \
--output-type benchmark
Stack trace and logs:
WARNING: [Torch-TensorRT] - Input type for doing shape analysis could not be determined, defaulting to F32
WARNING: [Torch-TensorRT] - There may be undefined behavior using dynamic shape and aten::size
WARNING: [Torch-TensorRT] - Truncating weight (constant in the graph) from Float64 to Float32
WARNING: [Torch-TensorRT TorchScript Conversion Context] - Detected invalid timing cache, setup a local cache instead
WARNING: [Torch-TensorRT TorchScript Conversion Context] - The logger passed into createInferBuilder differs from one already provided for an existing builder, runtime, or refitter. TensorRT maintains only a single logger pointer at any given time, so the existing value, which can be retrieved with getLogger(), will be used instead. In order to use a new logger, first destroy all existing builder, runner or refitter objects.
WARNING: [Torch-TensorRT] - There may be undefined behavior using dynamic shape and aten::size
[ERROR:root] Uncaught exception
Traceback (most recent call last):
File "/opt/conda/bin/sockeye-translate", line 33, in <module>
sys.exit(load_entry_point('sockeye', 'console_scripts', 'sockeye-translate')())
File "/workspace/temp/sockeye/sockeye/translate_pt.py", line 43, in main
run_translate(args)
File "/workspace/temp/sockeye/sockeye/translate_pt.py", line 147, in run_translate
read_and_translate(translator=translator,
File "/workspace/temp/sockeye/sockeye/translate_pt.py", line 234, in read_and_translate
chunk_time = translate(output_handler, chunk, translator)
File "/workspace/temp/sockeye/sockeye/translate_pt.py", line 257, in translate
trans_outputs = translator.translate(trans_inputs)
File "/workspace/temp/sockeye/sockeye/inference_pt.py", line 807, in translate
batch_translations = self._translate_np(*self._get_inference_input(translator_inputs)) # type: ignore
File "/workspace/temp/sockeye/sockeye/inference_pt.py", line 995, in _translate_np
return self._get_best_translations(self._search(source,
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/workspace/temp/sockeye/sockeye/beam_search_pt.py", line 778, in forward
model_states, estimated_reference_lengths = self._inference.encode_and_initialize(source, source_length)
File "/workspace/temp/sockeye/sockeye/beam_search_pt.py", line 70, in encode_and_initialize
states, predicted_output_length = self._model.encode_and_initialize(inputs, valid_length, self._const_lr)
File "/workspace/temp/sockeye/sockeye/model_pt.py", line 234, in encode_and_initialize
source_encoded, source_encoded_lengths = self.encode(inputs, valid_length=valid_length)
File "/workspace/temp/sockeye/sockeye/model_pt.py", line 200, in encode
self.traced_encoder = torch_tensorrt.compile(self.traced_encoder,
File "/opt/conda/lib/python3.8/site-packages/torch_tensorrt/_compile.py", line 97, in compile
return torch_tensorrt.ts.compile(ts_mod, inputs=inputs, enabled_precisions=enabled_precisions, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/torch_tensorrt/ts/_compiler.py", line 119, in compile
compiled_cpp_mod = _C.compile_graph(module._c, _parse_compile_spec(spec))
RuntimeError: [Error thrown at ./core/conversion/var/Var_inl.h:38] Expected ivalue->isInt() to be true but got false
Requested unwrapping of arg IValue assuming it was l however type is NoneType
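The check at Var_inl.h:38 fires when a converter unwraps an argument expecting an int64 IValue (type tag l) but the graph actually holds None, i.e. an Optional[int] argument traced with its None default. As a debugging sketch, one can raise the log level to identify the offending aten op and, as a hypothetical mitigation, keep that op in Torch via torch_executed_ops; aten::size is only a guess based on the dynamic-shape warnings above, not a confirmed culprit:

import torch
import torch_tensorrt

class TinyEncoder(torch.nn.Module):
    # Toy stand-in for the traced Sockeye encoder (illustration only)
    def forward(self, x):
        return torch.nn.functional.relu(x)

mod = torch.jit.trace(TinyEncoder().eval().cuda(), torch.randn(64, 128).cuda())

# Surface per-op conversion details so the failing op is identifiable
torch_tensorrt.logging.set_reportable_log_level(torch_tensorrt.logging.Level.Debug)

compiled = torch_tensorrt.compile(
    mod,
    inputs=[torch_tensorrt.Input((64, 128))],
    enabled_precisions={torch.half},
    torch_executed_ops=["aten::size"],  # hypothetical fallback, unverified
)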
Expected behavior
The model should compile without error and translate the input sentences.
Environment
Build information about Torch-TensorRT can be found by turning on debug messages
- Torch-TensorRT Version (e.g. 1.0.0): 1.0.0a0
- PyTorch Version (e.g. 1.0): 1.11.0a0+b6df043
- CPU Architecture: x86_64 (Intel Xeon Platinum 8259CL)
- OS (e.g., Linux): Ubuntu 20.04
- How you installed PyTorch (conda, pip, libtorch, source): NGC Container
- Python version: 3.8.12
- CUDA version: 11.5
- GPU models and configuration: Tesla T4