
mpicc / mpirun / etc. carry unnecessary external dependencies #9869


Closed
bwbarrett opened this issue Jan 13, 2022 · 15 comments


@bwbarrett
Member

Currently, Open MPI executables (like mpicc/mpirun/etc.) inherit dependencies on BTL communication libraries, like ugni. This leads to problems on some large-scale systems, where compute-node libraries are not in the default library search paths on the head nodes (I assume they must be available, otherwise linking of applications wouldn't work). For example:

mpicc --version
mpicc: error while loading shared libraries:
libugni.so.0: cannot open shared object file: No such file or directory

This has been a problem with Open MPI since the BTLs moved to OPAL, but is considerably more noticeable with the change to avoid building DSOs by default. #8800 proposed a fix by making components with external dependencies build as DSOs by default, but this defeats the entire reason we build without DSOs by default. Launch scalability with the old behavior was terrible because of the mass DSO loading at launch. The systems likely to run into the library dependency problem are the very ones that need the change in default behavior, and are likely to have many components with external dependencies.

The right solution is probably to move the BTLs back into the OMPI layer, but I assume @bosilca will object to that plan. A second plan, and likely the one we will have to implement, is to split OPAL into two libraries: the first just the base portability code (with minimal MCA inclusion) that is safe to use on the front end, and the second the full OPAL with the communication libraries. We already have a bit of this split, in that we have two different initialization routines (opal_init() and opal_init_util()). We just don't expose that split through libraries, which leads to the linking problems.
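For reference, a rough sketch of what that existing init split already gives us (illustrative only: `tool_main()` is a made-up entry point, and exact headers and return codes vary by Open MPI version):

```c
/*
 * Illustrative sketch only: tool_main() is a made-up entry point, and
 * exact headers and return codes may differ between Open MPI versions.
 */
#include "opal/constants.h"      /* OPAL_SUCCESS */
#include "opal/runtime/opal.h"   /* opal_init_util() / opal_finalize_util() */

int tool_main(int argc, char **argv)
{
    /* Front-end tools (mpicc and friends) only need the portability
     * layer: no BTL initialization, so no pull-in of libugni et al. */
    if (OPAL_SUCCESS != opal_init_util(&argc, &argv)) {
        return 1;
    }

    /* ... wrapper-compiler / info work ... */

    opal_finalize_util();
    return 0;
}
```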

@bosilca
Member

bosilca commented Jan 18, 2022

The BTLs are an easy target for discussion because they are obviously not necessary when not running an MPI job, but the issue remains true for many other shared libraries (hwloc, cuda).

Let me add a third possible approach: implement the dependencies in OPAL and do lazy loading of all frameworks.

@bwbarrett
Member Author

The very reason we made this change was to avoid dynamic loading, because it was killing the filesystems on large systems. Lazy loading is the very problem.

@bosilca
Member

bosilca commented Jan 18, 2022

Then I'm not sure I understand what the problem is. I would understand if everything were compiled statically, but that does not seem to be the case, or the example above would not try to load libugni.so. Thus, as long as we depend in any way on a shared library, there will be a loading storm.

What I am proposing is a little like RTLD_LAZY, but at the level of the OPAL components. The BTL framework would not be initialized until some other component requests it, which would result in fewer required dependencies for our wrappers. We can combine this with allowing some components to declare themselves as DSOs, and we get the best of both worlds: no impact on mpicc and little impact on an MPI application.
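Roughly, in sketch form (made-up names, not existing OPAL code), the lazy open would look like this:

```c
/*
 * Illustrative sketch of per-framework lazy opening; these are made-up
 * names, not existing OPAL symbols.
 */
#include <stdbool.h>

static bool btl_framework_is_open = false;

/* Stand-in for whatever would really open/register the BTL components
 * (and thereby pull in libugni, CUDA, etc.). */
static int open_btl_framework(void)
{
    return 0;
}

/* Components that actually need BTLs (e.g. a PML) call this on first use;
 * mpicc and the other front-end tools never do, so the framework is
 * never opened on their behalf. */
int btl_framework_lazy_open(void)
{
    if (!btl_framework_is_open) {
        int rc = open_btl_framework();
        if (0 != rc) {
            return rc;
        }
        btl_framework_is_open = true;
    }
    return 0;
}
```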

Or maybe I'm missing some critical point in this entire discussion?

@bwbarrett
Member Author

Are you suggesting that we add all the infrastructure to dynamically load libugni? That would be a hell of an ask. Just look at how big of a pain it is to deal with CUDA. To do that for all dependencies seems insane.

In general, I think the right answer is to use the DSO infrastructure we already have. However, a small set of components (the BTLs in OMPI, one or two components in PRRTE) cause problems because the large systems we're trying to make scale don't include some system libraries on the interactive nodes. I think that's a bit silly, but whatever. In the case of the utilities and OPAL, keep in mind that we already split the initialization code so that we don't initialize the BTLs for mpicc and friends. Solving the library inclusion problem isn't a huge reach from there.

@bosilca
Member

bosilca commented Jan 18, 2022

What I was proposing is to load OPAL frameworks only when needed. But after playing around a little bit, it does not seem like a satisfactory approach.

Thus, after going back and forth with @jsquyres, we might have found an alternative approach: using libtool convenience libraries to create a libopen-tiny.la that would contain the minimum OPAL support, i.e., exactly what opal_init_util needs. More to come.

@rhc54
Contributor

rhc54 commented Jan 18, 2022

Not sure I get how that solves anything. The 'mpirun' issue is a side one - what we are trying to do is eliminate backend loading of DSOs. Period. It doesn't scale. Brian is only noting that the problem also caused symptoms in mpicc, so the solution should address that too. Solving that problem by a method that results in DSOs being loaded again on the backend just takes us backward.

Just trying to avoid people spending time on the wrong problem.

@jsquyres
Member

@bosilca @bwbarrett and I are talking on the phone about this tomorrow -- we'll get it resolved.

@rhc54
Contributor

rhc54 commented Jan 19, 2022

FWIW: I've started looking into changing the way we handle this in PMIx/PRRTE. Instead of linking against these dependencies at build, I'm building all components and letting the main library absorb them (which is now our default). I then use dlopen on the backend to see if the required lib is present - if not, disqualify the component. If so, then use dlsym to map the few APIs I actually need.

Eliminates this problem and helps the distros so they don't have to load a bunch of stuff just to build us.
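In sketch form, that pattern looks something like the following (illustrative only: the library name, symbol, and component function are placeholders, not actual PMIx/PRRTE code):

```c
/*
 * Sketch of the probe-at-runtime pattern described above; not actual
 * PMIx/PRRTE code.  "libexample.so.0" and "example_init" are placeholders
 * for the real dependency and the few APIs a component needs from it.
 */
#include <dlfcn.h>
#include <stddef.h>

typedef int (*example_init_fn)(void);

static void *lib_handle = NULL;
static example_init_fn example_init = NULL;

/* Returns 0 if the component may be used, nonzero to disqualify it. */
int example_component_open(void)
{
    lib_handle = dlopen("libexample.so.0", RTLD_LAZY | RTLD_LOCAL);
    if (NULL == lib_handle) {
        /* Library not present on this node: quietly disqualify. */
        return 1;
    }

    /* Map only the handful of entry points actually needed. */
    example_init = (example_init_fn) dlsym(lib_handle, "example_init");
    if (NULL == example_init) {
        dlclose(lib_handle);
        lib_handle = NULL;
        return 1;
    }

    return 0;
}
```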

🤷‍♂️ What you folks do in OMPI is up to you.

bosilca added a commit to bosilca/ompi that referenced this issue Jan 19, 2022
The solution put forward here consists of creating a convenience library,
opal-tiny, that holds all the basic things we need to have a working
app with minimal external dependencies.

More detailed discussion in open-mpi#9869.

Signed-off-by: George Bosilca <[email protected]>
@jjhursey
Member

@rhc54 Will PRTE/OpenPMIx still have the ability to force a component to be built as a DSO via a configure option even if the default is static? Spectrum MPI builds at least one set of components as DSOs (lsf) because we can ship with the DSOs and still work on systems that do not have LSF installed (in which case the linking fails and the component is disqualified).

@rhc54
Contributor

rhc54 commented Jan 20, 2022

Yes - it is just the default not to do it. I'm still looking at this in terms of moving from configure-time build decisions to runtime include decisions. The idea is that the component is always built; it checks for the presence of the required library during component open, and disqualifies itself if the library isn't found. Hopefully that eliminates the need to install the libraries on the machine where the package is built.

Still just a concept - I need to investigate the overall benefits and impacts more. Not something happening soon. The near-term work is to remove external dependencies where they aren't absolutely needed.

@bwbarrett
Member Author

I unfortunately don't have much OMPI time these days, but the half-done branch is here: https://github.com/bwbarrett/ompi/commits/bugfix/mpicc-dependencies.

@jjhursey jjhursey self-assigned this Aug 31, 2022
@jjhursey
Member

I'm going to work on picking this up later this week. @bwbarrett Can you give me a sketch of your technique on your branch?

@bwbarrett
Member Author

The end goal is to have libopen-pal.so split into two libraries: libopen-pal_core.la, which is not installed, and libopen-pal.la, which is installed. mpicc/mpirun/etc. would link only libopen-pal_core.la (note that since it's a noinst library, we don't have to do the renaming for that library). Unfortunately, mpicc/mpirun/etc. need some functionality that is only available via MCA components, which complicates the stack.

The first patch in the branch largely provides a mechanism to split the MCA into two layers: components that commonly have external dependencies and components that don't. The second patch tries to split OPAL into a "util" layer holding all the single-instance code (which tends not to have external package dependencies) and a "top" layer holding the BTLs and the like.

It may be that we need to install libopen-pal_core to get the dependencies right in the components we're installing. That would require dealing with the renaming foo, but otherwise shouldn't be difficult. Make sense?

jjhursey pushed a commit to jjhursey/ompi that referenced this issue Sep 7, 2022
 * Fixes Issue open-mpi#9869
 * Split `libopen-pal.so` into two libraries:
   - `libopen-pal_core.la` : Internal "core" portion of OPAL containing the essential source and MCA needed for mpicc/mpirun tools to link against. The "core" library is not installed.
   - `libopen-pal.la` : Includes "core" plus all of the other OPAL project sources. The `.so` version of this is installed.
 * The "core" library contains the following:
   - `opal/class`
   - `opal/mca/backtrace`
   - `opal/mca/dl`
   - `opal/mca/if`
   - `opal/mca/installdirs`
   - `opal/mca/threads`
   - `opal/mca/timer`
   - `opal/runtime/*_util.[c|h]`
   - `opal/runtime/opal_info_support.c`
   - `opal/util (most - see Makefile.am)`
   - `opal/util/keyval`
 * The "core" library is linked into the following tools instead of the full `libopen-pal.so`:
   - `ompi/tools/mpirun`
   - `ompi/tools/wrappers` (by extension of `opal/tools/wrappers`)
   - `opal/tools/wrappers`
 * The `opal/runtime` files were divided into a 'util' set, representing the "core".

Co-authored-by: George Bosilca <[email protected]>
Co-authored-by: Brian Barrett <[email protected]>
Signed-off-by: Joshua Hursey <[email protected]>
@jjhursey
Member

jjhursey commented Sep 7, 2022

I posted an early preview PR #10779 -- I need to do more testing, but it is looking good so far.

jjhursey pushed a commit to jjhursey/ompi that referenced this issue Sep 14, 2022
 * Fixes Issue open-mpi#9869
 * Split `libopen-pal.so` into two libraries:
   - `libopen-pal_core.la` : Internal "core" portion of OPAL containing the essential source and MCA needed for mpicc/mpirun tools to link against. The "core" library is not installed.
   - `libopen-pal.la` : Includes "core" plus all of the other OPAL project sources. The `.so` version of this is installed.
 * The "core" library contains the following:
   - `opal/class`
   - `opal/mca/backtrace`
   - `opal/mca/dl`
   - `opal/mca/installdirs`
   - `opal/mca/threads`
   - `opal/mca/timer`
   - `opal/runtime/*_core.[c|h]`
   - `opal/runtime/opal_info_support.c`
   - `opal/util (most - see Makefile.am)`
   - `opal/util/keyval`
 * The "core" library is linked into the following tools instead of the full `libopen-pal.so`:
   - `ompi/tools/mpirun`
   - `ompi/tools/wrappers` (by extension of `opal/tools/wrappers`)
   - `opal/tools/wrappers`
 * The `opal/runtime` files were divided into a 'core' set vs 'main' set

Co-authored-by: George Bosilca <[email protected]>
Co-authored-by: Brian Barrett <[email protected]>
Signed-off-by: Joshua Hursey <[email protected]>