Description
I'm running an application on a cluster that uses CUDA-aware MPICH (v4.2.2) and UCX (v1.17.0). The application consists of two binaries, a server and a client, so I launch it in mpirun's MPMD mode: mpirun -np 1 server : -np 1 client. Whether I run it intra-node or inter-node, I get the following error and the application hangs:
[1724426336.207066] [c066:48733:0] ib_md.c:293 UCX ERROR ibv_reg_mr(address=0x55bace2c02a0, length=49792, access=0xf) failed: Bad address
[1724426336.207083] [c066:48733:0] ucp_mm.c:70 UCX ERROR failed to register address 0x55bace2c02a0 (host) length 49792 on md[8]=mlx5_1: Input/output error (md supports: host|cuda)
After some research, I found that setting the environment variable UCX_RCACHE_ENABLE=n allows my application to run without errors. However, runtime performance is then worse than expected: profiling shows that most of the time is spent on data transfer between the nodes.
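For comparing the two settings, the variable can be scoped to a single launch instead of being exported into the shell. The following is only a sketch: it assumes MPICH's Hydra launcher and its -genv NAME VALUE option, and build_cmd is a hypothetical helper that prints the command line rather than running it.

```shell
#!/bin/sh
# Sketch: scope UCX_RCACHE_ENABLE to one launch instead of exporting it.
# Assumption: mpirun is MPICH's Hydra launcher, whose -genv NAME VALUE option
# passes an environment variable to every process of the job.
# build_cmd is a hypothetical helper; it only prints the command line.
build_cmd() {
    rcache="$1"   # y or n
    echo "mpirun -genv UCX_RCACHE_ENABLE $rcache -np 1 server : -np 1 client"
}

build_cmd n   # workaround run (registration cache disabled)
build_cmd y   # run with the registration cache enabled, for comparison
```

The printed line can be copied into the job script as-is; server and client are the two binaries from the MPMD command above.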
When running the OSU 7.4 benchmark, I observed that inter-node bandwidth over InfiniBand is roughly 5.6 times lower when I set UCX_RCACHE_ENABLE=n:
export UCX_RCACHE_ENABLE=y
mpirun -ppn 1 -np 2 osu_bw -m 100000000:1000000000
or
mpirun -ppn 1 -np 1 osu_bw -m 100000000:1000000000 : -np 1 osu_bw -m 100000000:1000000000
# OSU MPI Bandwidth Test v7.4
# Datatype: MPI_CHAR.
# Size Bandwidth (MB/s)
100000000 23073.19
200000000 23037.84
400000000 23001.02
800000000 22948.29
export UCX_RCACHE_ENABLE=n
mpirun -ppn 1 -np 2 osu_bw -m 100000000:1000000000
or
mpirun -ppn 1 -np 1 osu_bw -m 100000000:1000000000 : -np 1 osu_bw -m 100000000:1000000000
# OSU MPI Bandwidth Test v7.4
# Datatype: MPI_CHAR.
# Size Bandwidth (MB/s)
100000000 4083.05
200000000 4078.84
400000000 4076.13
800000000 4076.14
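The slowdown can be checked directly from the two tables above. A minimal sketch, assuming the bandwidth figures are copied in by hand; the slowdown function name is hypothetical:

```shell
#!/bin/sh
# slowdown: hypothetical helper that prints the ratio of two bandwidths (MB/s),
# rounded to two decimals.
slowdown() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# First argument: bandwidth with UCX_RCACHE_ENABLE=y
# Second argument: bandwidth with UCX_RCACHE_ENABLE=n
slowdown 23073.19 4083.05   # 100000000-byte messages, prints 5.65
slowdown 22948.29 4076.14   # 800000000-byte messages, prints 5.63
```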
Any suggestions on why the application might be failing to register addresses?
Setup and versions
OS version:
cat /etc/redhat-release: Red Hat Enterprise Linux Server release 7.9 (Maipo)
Kernel:
uname -r: 3.10.0-1160.49.1.el7.x86_64
RDMA/IB version:
rpm -q libibverbs: libibverbs-54mlnx1-1.54310.x86_64
rpm -q rdma-core: rdma-core-devel-54mlnx1-1.54310.x86_64
IB HW:
- Each node has 2 IB NICs.
ibstat:
CA 'mlx5_0'
CA type: MT4115
Number of ports: 1
Firmware version: 12.27.1016
Hardware version: 0
Node GUID: 0x0800380300b49dac
System image GUID: 0x0800380300b49dac
Port 1:
State: Active
Physical state: LinkUp
Rate: 100
Base lid: 115
LMC: 0
SM lid: 1
Capability mask: 0x2651e848
Port GUID: 0x0800380300b49dac
Link layer: InfiniBand
CA 'mlx5_1'
CA type: MT4115
Number of ports: 1
Firmware version: 12.27.1016
Hardware version: 0
Node GUID: 0x0800380300b49da0
System image GUID: 0x0800380300b49da0
Port 1:
State: Active
Physical state: LinkUp
Rate: 100
Base lid: 177
LMC: 0
SM lid: 1
Capability mask: 0x2651e848
Port GUID: 0x0800380300b49da0
Link layer: InfiniBand
CUDA 12.0:
- Each node has four 32GB V100 GPUs
- cuda libraries: cuda-toolkit-12-0-12.0.0-1.x86_64
- cuda drivers: cuda-driver-devel-12-0-12.0.107-1.x86_64
- lsmod | grep nv_peer_mem:
nv_peer_mem 13369 0
ib_core 358225 11 rdma_cm,ib_cm,iw_cm,beegfs,nv_peer_mem,ko2iblnd,mlx5_ib,ib_umad,ib_uverbs,rdma_ucm,ib_ipoib
nvidia 56056886 55 nv_peer_mem,gdrdrv,nvidia_modeset,nvidia_uvm
lsmod|grep gdrdrv:
gdrdrv 18183 0
nvidia 56056886 55 nv_peer_mem,gdrdrv,nvidia_modeset,nvidia_uvm
ucx_info -v:
# Library version: 1.17.0
# Library path: /home/jhonatan.cleto/spack/opt/spack/linux-rhel7-cascadelake/gcc-11.4.0/ucx-1.17.0-qq5l5fowibcomrutchar7maekewkiloo/lib/libucs.so.0
# API headers version: 1.17.0
# Git branch '', revision 4ef9a09
# Configured with: --disable-logging --disable-debug --disable-assertions --disable-params-check --prefix=/home/jhonatan.cleto/spack/opt/spack/linux-rhel7-cascadelake/gcc-11.4.0/ucx-1.17.0-qq5l5fowibcomrutchar7maekewkiloo --without-go --disable-doxygen-doc --disable-assertions --enable-compiler-opt=3 --without-java --enable-shared --enable-static --disable-logging --disable-mt --with-openmp --enable-optimizations --disable-params-check --disable-gtest --with-pic --with-cuda=/home/jhonatan.cleto/spack/opt/spack/linux-rhel7-cascadelake/gcc-11.4.0/cuda-12.4.0-tddfkicmflo4uydz5vvubsl5233hiasi --enable-cma --without-dc --without-dm --with-gdrcopy=/home/jhonatan.cleto/spack/opt/spack/linux-rhel7-cascadelake/gcc-11.4.0/gdrcopy-2.4.1-i7vxfrthjgn7ojewfj5a4pwsspcsg4te --with-ib-hw-tm --with-knem=/home/jhonatan.cleto/spack/opt/spack/linux-rhel7-cascadelake/gcc-11.4.0/knem-1.1.4-bhkutyn7invsbjv3e32yg3k5fiusiah6 --without-mlx5-dv --with-rc --with-ud --with-xpmem=/home/jhonatan.cleto/spack/opt/spack/linux-rhel7-cascadelake/gcc-11.4.0/xpmem-2.6.5-36-oeerzcdtxg5h6qhtv7s2nmmsh5imj4xl --without-fuse3 --without-bfd --with-rdmacm=/home/jhonatan.cleto/spack/opt/spack/linux-rhel7-cascadelake/gcc-11.4.0/rdma-core-52.0-frbk7sgqzmo2vjgu642ryhq26e3dxma7 --with-verbs=/home/jhonatan.cleto/spack/opt/spack/linux-rhel7-cascadelake/gcc-11.4.0/rdma-core-52.0-frbk7sgqzmo2vjgu642ryhq26e3dxma7 --with-avx --without-rocm