Closed
Description
After launching a new process with MPI_Comm_spawn, the call to allocate a shared memory region fails:
[mio:03625] *** An error occurred in MPI_Win_allocate_shared
[mio:03625] *** reported by process [3230072833,0]
[mio:03625] *** on communicator
[mio:03625] *** MPI_ERR_RMA_SHARED: Memory cannot be shared
[mio:03625] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[mio:03625] *** and potentially your MPI job)
[mio:03620] 1 more process has sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[mio:03620] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
The following reproducer works with Intel MPI, but fails with Open MPI.
Parent code:
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <mpi.h>
#include <stdlib.h>
#include <unistd.h>
#include <assert.h>
int
main(int argc, char *argv[])
{
    int provided, n, rank, size;
    MPI_Comm intercomm, universe;
    MPI_Win win;
    int *buf;
    MPI_Aint bufsize;
    int disp = sizeof(int);
    int k;
    MPI_Init_thread(NULL, NULL, MPI_THREAD_MULTIPLE, &provided);
    assert(provided == MPI_THREAD_MULTIPLE);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_spawn("./worker", MPI_ARGV_NULL, 1,
                   MPI_INFO_NULL, 0, MPI_COMM_WORLD,
                   &intercomm, MPI_ERRCODES_IGNORE);
    MPI_Intercomm_merge(intercomm, 0, &universe);
    MPI_Comm_size(universe, &k);
    assert(k == 2);
    bufsize = sizeof(int);
    MPI_Win_allocate_shared(bufsize, 1, MPI_INFO_NULL, universe, &buf, &win);
    buf[0] = 666;
    MPI_Barrier(universe);
    /* Worker runs now */
    MPI_Barrier(universe);
    assert(buf[0] == 555);
    MPI_Finalize();
    return 0;
}
Worker code:
#include <mpi.h>
#include <stddef.h>
#include <assert.h>
int
main(int argc, char *argv[])
{
    MPI_Comm parent, universe;
    int *buf;
    size_t bufsize;
    int rank, disp;
    char hostname[100];
    MPI_Win win;
    MPI_Aint asize;
    MPI_Init(&argc, &argv);
    MPI_Comm_get_parent(&parent);
    assert(parent != MPI_COMM_NULL);
    MPI_Intercomm_merge(parent, 0, &universe);
    MPI_Win_allocate_shared(0, 1, MPI_INFO_NULL, universe, &buf, &win);
    MPI_Win_shared_query(win, MPI_PROC_NULL, &asize, &disp, &buf);
    MPI_Barrier(universe);
    assert(buf[0] == 666);
    buf[0] = 555;
    MPI_Barrier(universe);
    MPI_Finalize();
    return 0;
}
Makefile:
CC=mpicc
CFLAGS=-g -O0
BIN=main worker
all: $(BIN)
clean:
	rm -f $(BIN) *.o
To run, I use mpirun --oversubscribe -n 1 ./main.
Background information
What version of Open MPI are you using? (e.g., v1.10.3, v2.1.0, git branch name and hash, etc.)
Release v4.0.1
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
Arch Linux package install, build date Fri 29 Mar 2019.
Please describe the system on which you are running
- Operating system/version: 5.1.16-arch1-1-ARCH