Single GPU works fine, but the system hangs when I use multiple GPUs. Can someone help solve this? Thanks.
```
python build.py --model_dir meta-llama/Llama-2-7b-chat-hf \
    --dtype float16 \
    --remove_input_padding \
    --use_gpt_attention_plugin float16 \
    --enable_context_fmha \
    --use_gemm_plugin float16 \
    --output_dir ./tmp/llama/7B/trt_engines/fp16/4-gpu/ \
    --world_size 4 \
    --tp_size 4
```

```
mpirun -n 4 --allow-run-as-root \
    python ../summarize.py --test_trt_llm \
    --hf_model_dir meta-llama/Llama-2-7b-chat-hf \
    --data_type fp16 \
    --engine_dir ./tmp/llama/7B/trt_engines/fp16/4-gpu/
```
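Since the hang only appears with multiple ranks, one way to narrow it down is to check whether plain NCCL communication under `mpirun` works at all on the same machine, independent of TensorRT-LLM. The script below is a minimal sketch of such a sanity check, assuming PyTorch with NCCL support and OpenMPI are installed; the file name `nccl_sanity_check.py` and the master address/port values are placeholders, not part of the original report.

```python
import os
import torch
import torch.distributed as dist

# Launch with the same mpirun setup as the failing run, e.g.:
#   mpirun -n 4 --allow-run-as-root python nccl_sanity_check.py
# (nccl_sanity_check.py is a hypothetical file name for this sketch.)

# OpenMPI exposes the rank/world size through these environment variables.
rank = int(os.environ.get("OMPI_COMM_WORLD_RANK", "0"))
world_size = int(os.environ.get("OMPI_COMM_WORLD_SIZE", "1"))

# Rendezvous settings for torch.distributed (placeholder values).
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")

# Bind each rank to its own GPU and initialize the NCCL process group.
torch.cuda.set_device(rank)
dist.init_process_group("nccl", rank=rank, world_size=world_size)

# Each rank contributes a tensor of ones; after all_reduce the value
# should equal world_size. If this call hangs, the problem is in the
# multi-GPU communication setup rather than in TensorRT-LLM itself.
x = torch.ones(1, device="cuda")
dist.all_reduce(x)
print(f"rank {rank}: all_reduce result = {x.item()} (expected {world_size})")

dist.destroy_process_group()
```

If this check also hangs, the issue is likely at the NCCL/driver/topology level (e.g. peer-to-peer access between the GPUs); if it completes, the hang is more likely specific to the TensorRT-LLM engine or the summarize run.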
