Closed
Seeing this on the current master (using orte/test/mpi/simple_spawn.c):
$ mpirun -H rhc001:24 -n 3 ./simple_spawn
[17760257:0 pid 176346] starting up on node rhc001!
[17760257:1 pid 176347] starting up on node rhc001!
[17760257:2 pid 176348] starting up on node rhc001!
0 completed MPI_Init
2 completed MPI_Init
Parent [pid 176348] about to spawn!
Parent [pid 176346] about to spawn!
1 completed MPI_Init
Parent [pid 176347] about to spawn!
[17760258:0 pid 176355] starting up on node rhc001!
[17760258:1 pid 176356] starting up on node rhc001!
[17760258:2 pid 176357] starting up on node rhc001!
Parent done with spawn
Parent sending message to child
Parent done with spawn
Parent done with spawn
2 completed MPI_Init
Hello from the child 2 of 3 on host rhc001 pid 176357
0 completed MPI_Init
Hello from the child 0 of 3 on host rhc001 pid 176355
1 completed MPI_Init
Hello from the child 1 of 3 on host rhc001 pid 176356
Parent disconnected
Child 2 disconnected
Parent disconnected
Child 1 disconnected
Child 0 received msg: 38
<hang forever>
I'm not sure when this started. @ggouaillardet Would you have a chance to take a peek? If it is PMIx related, please let me know and I'll dive into it.
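For triage, it may help to tally the output above: all three parents report "Parent done with spawn", but only two "Parent disconnected" lines and two "Child N disconnected" lines appear before the hang. The following is a minimal sketch, not part of simple_spawn.c or any Open MPI tooling, that counts those lines in an excerpt of the quoted output. The function name `tally` and the `nprocs` parameter are hypothetical, chosen here for illustration.

```python
import re

# Excerpt of the output quoted in this report (spawn/disconnect lines only).
LOG = """\
Parent done with spawn
Parent sending message to child
Parent done with spawn
Parent done with spawn
Parent disconnected
Child 2 disconnected
Parent disconnected
Child 1 disconnected
Child 0 received msg: 38
"""


def tally(log, nprocs=3):
    """Count spawn/disconnect messages; nprocs is the -n value passed to mpirun."""
    spawned = len(re.findall(r"^Parent done with spawn$", log, re.M))
    parents_done = len(re.findall(r"^Parent disconnected$", log, re.M))
    # Child ranks that printed a disconnect message.
    children_done = {int(r) for r in re.findall(r"^Child (\d+) disconnected$", log, re.M)}
    return {
        "parents_spawned": spawned,
        "parents_disconnected": parents_done,
        "children_missing": sorted(set(range(nprocs)) - children_done),
    }


if __name__ == "__main__":
    print(tally(LOG))
```

On this excerpt the tally reports one parent and child 0 as never printing a disconnect message. The parent lines carry no rank, so the log alone does not show which parent is stuck. It is only suggestive that child 0 is the rank that received the message.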