
Deprecation of MPIR in favor of PMIx, but PMIx not functional equivalent to MPIR (yet) #6081

@ghost

Description

Background information

What version of Open MPI are you using? (e.g., v1.10.3, v2.1.0, git branch name and hash, etc.)

v4.0.0

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

From distribution tarball.

Please describe the system on which you are running

n/a


Details of the problem

Open MPI v4 deprecated MPIR (#5974), and the MPIR support code will be removed completely in next summer’s release.

However, the very important use case of launching an MPI job using mpirun under the control of a debugger is currently not supported by PMIx. I have been working with @rhc54 for a while as part of RFC23 to add support for this use case to PMIx, but that work is still in progress.

I opened this ticket mainly to give the issue wider visibility and to make the case that MPIR should not be removed until PMIx debugger support can do what MPIR does.

Launching an MPI job using mpirun under the control of a debugger

$ tool mpirun -n 64 ./hello_mpi.exe

This is a use case that MPIR supports easily, because it is how MPIR works:

  1. launch mpirun under a debugger, for example GDB
  2. enable MPIR, i.e. set MPIR_being_debugged=1
  3. request mpirun to wait at a well-defined location, i.e. break MPIR_Breakpoint
  4. at that location, read the proctable and attach to the individual MPI processes, i.e. print *MPIR_proctable@MPIR_proctable_size
  5. release all processes
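
For illustration, the first four steps can be scripted as a single GDB batch invocation. This is only a minimal sketch: the helper name `mpir_gdb_command` is made up for this example, it assumes `gdb` and `mpirun` are on the PATH, and it assumes the Open MPI libraries carry enough debug information for GDB to resolve the MPIR symbols (as in the session further down). It uses `print *MPIR_proctable@MPIR_proctable_size` so that GDB prints the table entries rather than the pointer. Step 5 is left out, because a real tool would attach to every PID from the proctable before releasing the job.

```python
import shlex


def mpir_gdb_command(mpirun_args, gdb="gdb"):
    """Build a gdb batch command line that walks the MPIR steps for mpirun.

    Hypothetical helper for illustration only; it builds the command
    line and does not run anything.
    """
    steps = [
        "start",                             # 1. run mpirun up to main()
        "set var MPIR_being_debugged = 1",   # 2. enable MPIR
        "break MPIR_Breakpoint",             # 3. stop at the well-defined location
        "continue",
        "print MPIR_proctable_size",         # 4. read the process table
        "print *MPIR_proctable@MPIR_proctable_size",
        # 5. (not scripted) attach to each pid, then release all processes
    ]
    cmd = [gdb, "-batch"]
    for step in steps:
        cmd += ["-ex", step]
    # Everything after --args is the program to debug plus its arguments,
    # so the user's mpirun options are passed through untouched.
    cmd += ["--args", "mpirun", *mpirun_args]
    return cmd


if __name__ == "__main__":
    # Print a copy-pasteable shell command for the reproducer.
    print(shlex.join(mpir_gdb_command(["-n", "2", "./hello_c"])))
```

Note that the mpirun arguments are handed through unchanged, which is exactly the convenience argued for below: the tool does not need to understand `-n`, `-host` and friends.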

This is certainly how we use MPIR in Arm DDT, and I imagine other debuggers do the same.

Allowing the user to start an MPI job directly with mpirun (under the control of a debugger) is very convenient: users do not need to learn the idiosyncrasies of how each debugger maps certain mpirun arguments, such as -np, -cpu-set or -host. Worse, the debugger might not even be aware of certain mpirun arguments and thus will not allow users to set them. The same goes for PMIx: a debugger cannot be aware of every possible mapping from mpirun arguments to PMIx parameters, and it is not appropriate to offload PMIx internals to users either.

MPIR deprecation warning as seen with a GDB-only reproducer

Just leaving this here for completeness.

gdb --args mpirun -n 2 ./hello_c
No symbol table is loaded.  Use the "file" command.
GNU gdb (GDB) 7.12.1
...
Reading symbols from mpirun...(no debugging symbols found)...done.
(gdb) start
Temporary breakpoint 1 at 0x400e10
Starting program: /.../mpirun -n 2 ./hello_c
...
Temporary breakpoint 1, 0x0000000000400e10 in main ()
(gdb) set MPIR_being_debugged=1
(gdb) break MPIR_Breakpoint
Breakpoint 2 at 0x7ffff7bab5b0: file orted/orted_submit.c, line 184.
(gdb) cont
Continuing.
...
--------------------------------------------------------------------------
Open MPI has detected that you have attached a debugger to this MPI
job, and that debugger is using the legacy "MPIR" method of
attachment.

Please note that Open MPI has deprecated the "MPIR" debugger
attachment method in favor of the new "PMIx" debugger attchment
mechanisms.

*** This means that future versions of Open MPI may not support the
*** "MPIR" debugger attachment method at all.  Specifically: the
*** debugger you just attached may not work with future versions of
*** Open MPI.

You may wish to contact your debugger vendor to inquire about support
for PMIx-based debugger attachment mechanisms. Meantime, you can
disable this warning by setting the OMPI_MPIR_DO_NOT_WARN envar to 1.
--------------------------------------------------------------------------

Thread 1 "mpirun" hit Breakpoint 2, MPIR_Breakpoint () at orted/orted_submit.c:184
184	}
(gdb) quit
A debugging session is active.

	Inferior 1 [process 372] will be killed.

Quit anyway? (y or n) y
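
As the warning text says, it can be silenced by setting OMPI_MPIR_DO_NOT_WARN to 1. A minimal sketch of how a tool might do that for the process it launches; the helper name `env_without_mpir_warning` is hypothetical, and only the environment variable itself comes from the message above.

```python
import os


def env_without_mpir_warning(base_env=None):
    """Return a copy of the environment with the MPIR warning disabled.

    Hypothetical helper; the variable name is taken from the Open MPI
    warning text, everything else is illustrative.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["OMPI_MPIR_DO_NOT_WARN"] = "1"
    return env


# Possible use (not executed here): pass the result as env= to
# subprocess.run() when starting gdb, which hands its environment
# on to the mpirun it launches.
```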
