mpicc / mpirun / etc. carry unnecessary external dependencies #9869
Comments
The BTLs are an easy target for discussion because they are obviously not necessary when not running an MPI job, but the issue remains true for many other shared libraries (hwloc, cuda). Let me add a third possible approach: implement the dependencies in OPAL and do lazy loading of all frameworks.
The very reason we made this change was to avoid dynamic loading, which was killing the filesystems on large systems. Lazy loading is exactly the problem.
Then I'm not sure I understand what the problem is. I would understand if everything were compiled statically, but that does not seem to be the case, or the example above would not try to load the library. What I am proposing is a little like RTLD_LAZY, but at the level of the OPAL components. Thus, the BTL framework will not be initialized until some other component requests it, which will result in fewer dependencies for our wrappers. We can combine this with allowing some components to declare themselves as DSOs, and we get the best of both worlds: no impact on mpicc and little impact on an MPI application. Or maybe I'm missing some critical point in this entire discussion?
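A minimal sketch of that lazy-opening idea, using illustrative stand-ins rather than the real MCA framework API (the `framework_open()` guard and names below are hypothetical, not actual Open MPI code):

```c
#include <stdbool.h>

/* Illustrative stand-ins for the real MCA framework structures and calls. */
struct framework;
extern struct framework btl_framework;
extern int framework_open(struct framework *fw);

static bool btl_framework_opened = false;

/* Open the BTL framework only on first use, so tools that never ask for
 * BTLs (mpicc, mpirun) never trigger initialization of ugni, CUDA, etc. */
int lazy_open_btls(void)
{
    if (!btl_framework_opened) {
        int rc = framework_open(&btl_framework);
        if (0 != rc) {
            return rc;
        }
        btl_framework_opened = true;
    }
    return 0;
}
```

Note that this only defers initialization; it does not remove the link-time dependency from the library itself unless the components are also built as DSOs, which is the point raised in the replies below.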
are you suggesting that we add all the infrastructure to dynamically load libugni? That would be a hell of an ask. Just look at how big of a pain it is to deal with CUDA. To do that with all dependencies seems insane.

In general, I think the right answer is to use the dso infrastructure we already have. However, a small set of components (the BTLs in OMPI, one or two components in PRRTE) cause problems because the large systems we're trying to make scale don't include some system libraries on the interactive nodes. I think that's a bit silly, but whatever. In the case of the utilities and OPAL, keep in mind that we already split the initialization code so that we don't initialize the BTLs for mpicc and friends. Solving the library inclusion problem isn't a huge reach from there.
What I was proposing is to load OPAL frameworks only when needed. But after playing around a little bit, it does not seem like a satisfactory approach. Thus, after going back and forth with @jsquyres, we might have found an alternative approach: using the libtool convenience libraries to create a libopen-tiny.la that would contain the minimum OPAL support, i.e., exactly what the mpicc/mpirun tools need.
Not sure I get how that solves anything. The 'mpirun' issue is a side one - what we are trying to do is eliminate backend loading of DSOs. Period. It doesn't scale. Brian is only noting that the problem also caused symptoms in mpicc and friends. Just trying to avoid people spending time on the wrong problem.
@bosilca @bwbarrett and I are talking on the phone about this tomorrow -- we'll get it resolved.
FWIW: I've started looking into changing the way we handle this in PMIx/PRRTE. Instead of linking against these dependencies at build, I'm building all components and letting the main library absorb them (which is now our default). I then pick up the required libraries at runtime instead. Eliminates this problem and helps the distros so they don't have to load a bunch of stuff just to build us. 🤷 What you folks do in OMPI is up to you.
The solution put forward here consists of creating a convenience library, opal-tiny, that holds all the basic things we need to have a working app with minimal external dependencies. More detailed discussion in open-mpi#9869.

Signed-off-by: George Bosilca <[email protected]>
@rhc54 Will PRTE/OpenPMIx still have the ability to force a component to be built as a DSO via a configure option even if the default is static? Spectrum MPI builds at least one set of components as DSOs (lsf) because we can ship with the DSOs and still work on systems that do not have LSF installed (in which case the linking fails and the component is disqualified).
Yes - it is just the default to not do it. I'm still looking at this in terms of moving from configure-time build decisions to runtime include decisions. The idea is that the component is always built; it checks for the presence of the required library during component open and disqualifies itself if the library isn't found. Hopefully that eliminates the need to install libraries on the machine where the package is built. Still just a concept - need to investigate more into the overall benefits and impacts. Not something happening soon. Near term work is to remove external dependencies where they aren't absolutely needed.
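A rough sketch of that runtime check, using plain dlopen(); the component name, library name, and return codes here are illustrative, not actual PRRTE code:

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical component-open hook: rather than linking against liblsf at
 * build time, probe for it when the component is opened and disqualify the
 * component on nodes where the library is not installed. */
static void *liblsf_handle = NULL;

int lsf_component_open(void)
{
    liblsf_handle = dlopen("liblsf.so", RTLD_NOW | RTLD_GLOBAL);
    if (NULL == liblsf_handle) {
        /* Library missing: quietly drop out of the selection process
         * instead of failing the whole launch. */
        return -1;  /* stand-in for a "not available" error code */
    }
    return 0;
}

void lsf_component_close(void)
{
    if (NULL != liblsf_handle) {
        dlclose(liblsf_handle);
        liblsf_handle = NULL;
    }
}
```

The component would then resolve the symbols it needs via dlsym() before first use, so nothing references LSF symbols directly at link time and the build machine no longer needs the library installed.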
I unfortunately don't have much OMPI time these days, but the half-done branch is here: https://github.com/bwbarrett/ompi/commits/bugfix/mpicc-dependencies.
I'm going to work on picking this up later this week. @bwbarrett Can you give me a sketch of your technique on your branch?
The end goal is to have libopen-pal.so split into two libraries: libopen-pal_core.la that is not installed, and libopen-pal.la that is installed. mpicc/mpirun/etc. would only link libopen-pal_core.la (note that since it's a noinst library, we don't have to do the renaming for that library). Unfortunately, mpicc/mpirun/etc. need some functionality that is only available via mca components, which complicates the stack.

The first patch in the branch is largely providing a mechanism to split the two MCA layers into stuff that commonly has external dependencies and stuff that doesn't. The second patch tries to split opal into a "util" layer that's all the single instance (which tends not to have external package dependencies) and a "top" layer that's all the btl and stuff. It may be that we need to install libopen-pal_core to get the dependencies right in the components we're installing. That would require dealing with the renaming foo, but otherwise shouldn't be difficult. Make sense?
* Fixes Issue open-mpi#9869
* Split `libopen-pal.so` into two libraries:
  - `libopen-pal_core.la`: Internal "core" portion of OPAL containing the essential source and MCA needed for the mpicc/mpirun tools to link against. The "core" library is not installed.
  - `libopen-pal.la`: Includes "core" plus all of the other OPAL project sources. The `.so` version of this is installed.
* The "core" library contains the following:
  - `opal/class`
  - `opal/mca/backtrace`
  - `opal/mca/dl`
  - `opal/mca/if`
  - `opal/mca/installdirs`
  - `opal/mca/threads`
  - `opal/mca/timer`
  - `opal/runtime/*_util.[c|h]`
  - `opal/runtime/opal_info_support.c`
  - `opal/util` (most - see Makefile.am)
  - `opal/util/keyval`
* The "core" library is linked into the following tools instead of the full `libopen-pal.so`:
  - `ompi/tools/mpirun`
  - `ompi/tools/wrappers` (by extension of `opal/tools/wrappers`)
  - `opal/tools/wrappers`
* The `opal/runtime` files were divided into a 'util' set, representing the "core".

Co-authored-by: George Bosilca <[email protected]>
Co-authored-by: Brian Barrett <[email protected]>
Signed-off-by: Joshua Hursey <[email protected]>
I posted an early preview, PR #10779 -- I need to do more testing, but it is looking good so far.
* Fixes Issue open-mpi#9869
* Split `libopen-pal.so` into two libraries:
  - `libopen-pal_core.la`: Internal "core" portion of OPAL containing the essential source and MCA needed for the mpicc/mpirun tools to link against. The "core" library is not installed.
  - `libopen-pal.la`: Includes "core" plus all of the other OPAL project sources. The `.so` version of this is installed.
* The "core" library contains the following:
  - `opal/class`
  - `opal/mca/backtrace`
  - `opal/mca/dl`
  - `opal/mca/installdirs`
  - `opal/mca/threads`
  - `opal/mca/timer`
  - `opal/runtime/*_core.[c|h]`
  - `opal/runtime/opal_info_support.c`
  - `opal/util` (most - see Makefile.am)
  - `opal/util/keyval`
* The "core" library is linked into the following tools instead of the full `libopen-pal.so`:
  - `ompi/tools/mpirun`
  - `ompi/tools/wrappers` (by extension of `opal/tools/wrappers`)
  - `opal/tools/wrappers`
* The `opal/runtime` files were divided into a 'core' set vs a 'main' set.

Co-authored-by: George Bosilca <[email protected]>
Co-authored-by: Brian Barrett <[email protected]>
Signed-off-by: Joshua Hursey <[email protected]>
Currently, Open MPI executables (like mpicc/mpirun/etc.) inherit dependencies on BTL communication libraries, like ugni. This leads to problems on some large-scale systems, where compute-node libraries are not in the default library search paths on the head nodes (I assume they must be available somewhere, otherwise linking of applications wouldn't work). For example:
This has been a problem with Open MPI since the BTLs moved to OPAL, but is considerably more noticeable with the change to avoid building DSOs by default. #8800 proposed a fix by making components with external dependencies build as DSOs by default, but this defeats the entire reason we build without DSOs by default. Launch scalability with the old behavior was terrible because of the mass DSO loading at launch. The systems likely to run into the library dependency problem are the very ones that need the change in default behavior, and are likely to have many components with external dependencies.
The right solution is probably to move the BTLs back into the OMPI layer, but I assume @bosilca will object to that plan. A second plan, and likely the one we will have to implement, is to split OPAL into two libraries. The first is just the base portability code (with minimal MCA inclusion) that is safe to use on the front-end, and the second is the full OPAL with communication libraries. We already have a bit of this split, in that we have two different initialization routines (`opal_init()` and `opal_init_util()`). We just don't expose that split through libraries, leading to linking problems.
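For reference, a sketch of how those two existing entry points split front-end tools from MPI processes; the signatures follow `opal/runtime/opal.h`, but the helper functions here are illustrative only:

```c
/* Sketch only: error handling trimmed, helper names are hypothetical. */
#include "opal/runtime/opal.h"

/* A front-end tool (mpicc, mpirun) only needs the portability layer,
 * so it should not pull in BTLs or their external libraries (ugni, CUDA). */
int tool_startup(int *argc, char ***argv)
{
    return opal_init_util(argc, argv);
}

/* An MPI process needs the full stack, including the frameworks that
 * drag in the external communication libraries. */
int app_startup(int *argc, char ***argv)
{
    return opal_init(argc, argv);
}
```

Exposing that same boundary at the library level - a not-installed `libopen-pal_core.la` for the tools plus the full `libopen-pal.la` - is what the commits referenced above implement.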