From f64cc79317934429ac37b4d1f71326b822493a9b Mon Sep 17 00:00:00 2001
From: Jeff Squyres
Date: Tue, 7 Mar 2017 17:38:59 -0500
Subject: [PATCH] README: bring changes back from v2.x branch

Signed-off-by: Jeff Squyres

[skip ci]
bot:notest
---
 README | 171 ++++++++++++++++++++-------------------------------------
 1 file changed, 59 insertions(+), 112 deletions(-)

diff --git a/README b/README
index 520a5ba4d4f..7751d5767ee 100644
--- a/README
+++ b/README
@@ -192,6 +192,11 @@ Compiler Notes
   f95/g95), or by disabling the Fortran MPI bindings with
   --disable-mpi-fortran.
 
+- On OpenBSD/i386, if you configure with
+  --enable-mca-no-build=patcher, you will also need to add
+  --disable-dlopen.  Otherwise, odd crashes can occur
+  nondeterministically.
+
 - Absoft 11.5.2 plus a service pack from September 2012 (which Absoft
   says is available upon request), or a version later than 11.5.2
   (e.g., 11.5.3), is required to compile the new Fortran mpi_f08
@@ -514,8 +519,8 @@ MPI Functionality and Features
 
   This library is being offered as a "proof of concept" / convenience
   from Open MPI.  If there is interest, it is trivially easy to extend
-  it to printf for other MPI functions.  Patches and/or suggestions
-  would be greatfully appreciated on the Open MPI developer's list.
+  it to printf for other MPI functions.  Pull requests on github.com
+  would be greatly appreciated.
 
 OSHMEM Functionality and Features
 ---------------------------------
@@ -548,41 +553,6 @@ MPI Collectives
   (FCA) is a solution for offloading collective operations from the
   MPI process onto Mellanox QDR InfiniBand switch CPUs and HCAs.
 
-- The "ML" coll component is an implementation of MPI collective
-  operations that takes advantage of communication hierarchies in
-  modern systems. A ML collective operation is implemented by
-  combining multiple independently progressing collective primitives
-  implemented over different communication hierarchies, hence a ML
-  collective operation is also referred to as a hierarchical
-  collective operation. The number of collective primitives that are
-  included in a ML collective operation is a function of
-  subgroups(hierarchies). Typically, MPI processes in a single
-  communication hierarchy such as CPU socket, node, or subnet are
-  grouped together into a single subgroup (hierarchy). The number of
-  subgroups are configurable at runtime, and each different collective
-  operation could be configured to have a different of number of
-  subgroups.
-
-  The component frameworks and components used by/required for a
-  "ML" collective operation.
-
-  Frameworks:
-   * "sbgp" - Provides functionality for grouping processes into
-     subgroups
-   * "bcol" - Provides collective primitives optimized for a particular
-     communication hierarchy
-
-  Components:
-   * sbgp components - Provides grouping functionality over a CPU
-                       socket ("basesocket"), shared memory
-                       ("basesmuma"), Mellanox's ConnectX HCA
-                       ("ibnet"), and other interconnects supported by
-                       PML ("p2p")
-   * BCOL components - Provides optimized collective primitives for
-                       shared memory ("basesmuma"), Mellanox's ConnectX
-                       HCA ("iboffload"), and other interconnects
-                       supported by PML ("ptpcoll")
-
 - The "cuda" coll component provides CUDA-aware support for the
   reduction type collectives with GPU buffers. This component is only
   compiled into the library when the library has been configured with
"ob1" uses BTL ("Byte Transfer Layer") components for each - supported network. "cm" uses MTL ("Matching Tranport Layer") - components for each supported network. "yalla" uses the Mellanox - MXM transport. +- There are four main MPI network models available: "ob1", "cm", + "yalla", and "ucx". "ob1" uses BTL ("Byte Transfer Layer") + components for each supported network. "cm" uses MTL ("Matching + Tranport Layer") components for each supported network. "yalla" + uses the Mellanox MXM transport. "ucx" uses the OpenUCX transport. - "ob1" supports a variety of networks that can be used in combination with each other: @@ -644,7 +614,7 @@ Network Support - Similarly, there are two OSHMEM network models available: "yoda", and "ikrit". "yoda" also uses the BTL components for many supported - network. "ikrit" interfaces directly with Mellanox MXM. + networks. "ikrit" interfaces directly with Mellanox MXM. - "yoda" supports a variety of networks that can be used: @@ -652,6 +622,7 @@ Network Support - Loopback (send-to-self) - Shared memory - TCP + - usNIC - "ikrit" only supports Mellanox MXM. @@ -668,7 +639,7 @@ Network Support - The usnic BTL is support for Cisco's usNIC device ("userspace NIC") on Cisco UCS servers with the Virtualized Interface Card (VIC). Although the usNIC is accessed via the OpenFabrics Libfabric API - stack, this BTL is specific to the Cisco usNIC device. + stack, this BTL is specific to Cisco usNIC devices. - uGNI is a Cray library for communicating over the Gemini and Aries interconnects. @@ -700,9 +671,9 @@ Network Support Open MPI Extensions ------------------- -- An MPI "extensions" framework has been added (but is not enabled by - default). See the "Open MPI API Extensions" section below for more - information on compiling and using MPI extensions. +- An MPI "extensions" framework is included in Open MPI, but is not + enabled by default. See the "Open MPI API Extensions" section below + for more information on compiling and using MPI extensions. - The following extensions are included in this version of Open MPI: @@ -726,10 +697,9 @@ Building Open MPI Open MPI uses a traditional configure script paired with "make" to build. Typical installs can be of the pattern: ---------------------------------------------------------------------------- shell$ ./configure [...options...] -shell$ make all install ---------------------------------------------------------------------------- +shell$ make [-j N] all install + (use an integer value of N for parallel builds) There are many available configure options (see "./configure --help" for a full list); a summary of the more commonly used ones is included @@ -752,16 +722,16 @@ INSTALLATION OPTIONS files in /include, its libraries in /lib, etc. --disable-shared - By default, libmpi and libshmem are built as a shared library, and - all components are built as dynamic shared objects (DSOs). This - switch disables this default; it is really only useful when used with + By default, Open MPI and OpenSHMEM build shared libraries, and all + components are built as dynamic shared objects (DSOs). This switch + disables this default; it is really only useful when used with --enable-static. Specifically, this option does *not* imply --enable-static; enabling static libraries and disabling shared libraries are two independent options. --enable-static - Build libmpi and libshmem as static libraries, and statically link in all - components. 
@@ -752,16 +722,16 @@ INSTALLATION OPTIONS
   files in <directory>/include, its libraries in <directory>/lib, etc.
 
 --disable-shared
-  By default, libmpi and libshmem are built as a shared library, and
-  all components are built as dynamic shared objects (DSOs). This
-  switch disables this default; it is really only useful when used with
+  By default, Open MPI and OpenSHMEM build shared libraries, and all
+  components are built as dynamic shared objects (DSOs). This switch
+  disables this default; it is really only useful when used with
   --enable-static. Specifically, this option does *not* imply
   --enable-static; enabling static libraries and disabling shared
   libraries are two independent options.
 
 --enable-static
-  Build libmpi and libshmem as static libraries, and statically link in all
-  components. Note that this option does *not* imply
+  Build Open MPI and OpenSHMEM as static libraries, and statically
+  link in all components. Note that this option does *not* imply
   --disable-shared; enabling static libraries and disabling shared
   libraries are two independent options.
 
@@ -838,7 +808,7 @@ NETWORKING SUPPORT / OPTIONS
   Specify the directory where the Mellanox FCA library and header
   files are located.
 
-  FCA is the support library for Mellanox QDR switches and HCAs.
+  FCA is the support library for Mellanox switches and HCAs.
 
 --with-hcoll=<directory>
   Specify the directory where the Mellanox hcoll library and header
@@ -867,7 +837,8 @@ NETWORKING SUPPORT / OPTIONS
   compiler/linker search paths.
 
   Libfabric is the support library for OpenFabrics Interfaces-based
-  network adapters, such as Cisco usNIC, Intel True Scale PSM, etc.
+  network adapters, such as Cisco usNIC, Intel True Scale PSM, Cray
+  uGNI, etc.
 
 --with-libfabric-libdir=<directory>
   Look in directory for the libfabric libraries. By default, Open MPI
@@ -939,13 +910,14 @@ NETWORKING SUPPORT / OPTIONS
   Look in directory for Intel SCIF support libraries
 
 --with-verbs=<directory>
-  Specify the directory where the verbs (also know as OpenFabrics, and
-  previously known as OpenIB) libraries and header files are located.
-  This option is generally only necessary if the verbs headers and
-  libraries are not in default compiler/linker search paths.
+  Specify the directory where the verbs (also known as OpenFabrics
+  verbs, or Linux verbs, and previously known as OpenIB) libraries and
+  header files are located. This option is generally only necessary
+  if the verbs headers and libraries are not in default
+  compiler/linker search paths.
 
-  "OpenFabrics" refers to operating system bypass networks, such as
-  InfiniBand, usNIC, iWARP, and RoCE (aka "IBoIP").
+  The Verbs library usually implies operating system bypass networks,
+  such as InfiniBand, usNIC, iWARP, and RoCE (aka "IBoIP").
 
 --with-verbs-libdir=<directory>
   Look in directory for the verbs libraries. By default, Open MPI
@@ -981,9 +953,6 @@ RUN-TIME SYSTEM SUPPORT
   path names. --enable-orterun-prefix-by-default is a synonym for
   this option.
 
---enable-sensors
-  Enable internal sensors (default: disabled).
-
 --enable-orte-static-ports
   Enable orte static ports for tcp oob (default: enabled).
 
@@ -1163,12 +1132,6 @@ MPI FUNCTIONALITY
 
 --enable-mpi-thread-multiple
   Allows the MPI thread level MPI_THREAD_MULTIPLE.
-  This is currently disabled by default. Enabling
-  this feature will automatically --enable-opal-multi-threads.
-
---enable-opal-multi-threads
-  Enables thread lock support in the OPAL and ORTE layers. Does
-  not enable MPI_THREAD_MULTIPLE - see above option for that feature.
   This is currently disabled by default.
 
 --enable-mpi-cxx
@@ -1246,11 +1209,6 @@ MISCELLANEOUS FUNCTIONALITY
   However, it may be necessary to disable the memory manager in order
   to build Open MPI statically.
 
---with-ft=TYPE
-  Specify the type of fault tolerance to enable. Options: LAM
-  (LAM/MPI-like), cr (Checkpoint/Restart). Fault tolerance support is
-  disabled unless this option is specified.
-
 --enable-peruse
   Enable the PERUSE MPI data analysis interface.
 
@@ -1476,25 +1434,14 @@ The "A.B.C" version number may optionally be followed by a Quantifier:
 Nightly development snapshot tarballs use a different version number
 scheme; they contain three distinct values:
 
-   * The most recent Git tag name on the branch from which the tarball
-     was created.
-   * An integer indicating how many Git commits have occurred since
-     that Git tag.
-   * The Git hash of the tip of the branch.
+   * The git branch name from which the tarball was created.
+   * The date/timestamp, in YYYYMMDDHHMM format.
+   * The hash of the git commit from which the tarball was created.
 
 For example, a snapshot tarball filename of
-"openmpi-v1.8.2-57-gb9f1fd9.tar.bz2" indicates that this tarball was
-created from the v1.8 branch, 57 Git commits after the "v1.8.2" tag,
-specifically at Git hash gb9f1fd9.
-
-Open MPI's Git master branch contains a single "dev" tag. For
-example, "openmpi-dev-8-gf21c349.tar.bz2" represents a snapshot
-tarball created from the master branch, 8 Git commits after the "dev"
-tag, specifically at Git hash gf21c349.
-
-The exact value of the "number of Git commits past a tag" integer is
-fairly meaningless; its sole purpose is to provide an easy,
-human-recognizable ordering for snapshot tarballs.
+"openmpi-v2.x-201703070235-e4798fb.tar.gz" indicates that this tarball
+was created from the v2.x branch, on March 7, 2017, at 2:35am GMT,
+from git hash e4798fb.
 
 Shared Library Version Number
 -----------------------------
@@ -1816,11 +1763,10 @@ Open MPI supports oshrun to launch OSHMEM applications. For example:
 
 OSHMEM applications may also be launched directly by resource managers
 such as SLURM. For example, when OMPI is configured --with-pmi and
---with-slurm one may launch OSHMEM applications via srun:
+--with-slurm, one may launch OSHMEM applications via srun:
 
   shell$ srun -N 2 hello_world_oshmem
 
-
 ===========================================================================
 
 The Modular Component Architecture (MCA)
@@ -1834,7 +1780,6 @@ component frameworks in Open MPI:
 
 MPI component frameworks:
 -------------------------
-bcol - Base collective operations
 bml - BTL management layer
 coll - MPI collective algorithms
 fbtl - file byte transfer layer: abstraction for individual
@@ -1848,7 +1793,6 @@ op - Back end computations for intrinsic MPI_Op operators
 osc - MPI one-sided communications
 pml - MPI point-to-point management layer
 rte - Run-time environment operations
-sbgp - Collective operation sub-group
 sharedfp - shared file pointer operations for MPI I/O
 topo - MPI topology routines
 vprotocol - Protocols for the "v" PML
@@ -1892,7 +1836,6 @@ Miscellaneous frameworks:
 allocator - Memory allocator
 backtrace - Debugging call stack backtrace support
 btl - Point-to-point Byte Transfer Layer
-compress - Compression algorithms
 dl - Dynamic loading library interface
 event - Event library (libevent) versioning support
 hwloc - Hardware locality (hwloc) versioning support
@@ -1924,8 +1867,8 @@ to see what its tunable parameters are. For example:
 
   shell$ ompi_info --param btl tcp
 
-shows a some of parameters (and default values) for the tcp btl
-component.
+shows some of the parameters (and default values) for the tcp btl
+component (use "--level 9" to show *all* the parameters; see below).
 
 Note that ompi_info only shows a small number a component's MCA
 parameters by default. Each MCA parameter has a "level" value from 1
@@ -2008,10 +1951,10 @@ variable; an environment variable will override the system-wide
 defaults.
 
 Each component typically activates itself when relevant. For example,
-the MX component will detect that MX devices are present and will
-automatically be used for MPI communications. The SLURM component
-will automatically detect when running inside a SLURM job and activate
-itself. And so on.
+the usNIC component will detect that usNIC devices are present and
+will automatically be used for MPI communications. The SLURM
+component will automatically detect when running inside a SLURM job
+and activate itself. And so on.
 
 Components can be manually activated or deactivated if necessary, of
 course. The most common components that are manually activated,
@@ -2025,10 +1968,14 @@ comma-delimited list to the "btl" MCA parameter:
 
   shell$ mpirun --mca btl tcp,self hello_world_mpi
 
-To add shared memory support, add "sm" into the command-delimited list
-(list order does not matter):
+To add shared memory support, add "vader" into the comma-delimited
+list (list order does not matter):
+
+  shell$ mpirun --mca btl tcp,vader,self hello_world_mpi
 
-  shell$ mpirun --mca btl tcp,sm,self hello_world_mpi
+(there is an "sm" shared memory BTL, too, but "vader" is a newer
+generation of shared memory support; by default, "vader" will be used
+instead of "sm")
 
 To specifically deactivate a specific component, the comma-delimited
 list can be prepended with a "^" to negate it:
@@ -2073,10 +2020,10 @@ user's list:
     http://lists.open-mpi.org/mailman/listinfo/users
 
 Developer-level bug reports, questions, and comments should generally
-be sent to the developer's mailing list (devel@lists.open-mpi.org). Please
-do not post the same question to both lists. As with the user's list,
-only subscribers are allowed to post to the developer's list. Visit
-the following web page to subscribe:
+be sent to the developer's mailing list (devel@lists.open-mpi.org).
+Please do not post the same question to both lists. As with the
+user's list, only subscribers are allowed to post to the developer's
+list. Visit the following web page to subscribe:
 
     http://lists.open-mpi.org/mailman/listinfo/devel
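
For reference, the "hello_world_mpi" program named in the mpirun
examples above can be any MPI executable; it is not shipped with this
patch. A minimal sketch in C (the filename and program below are
illustrative assumptions only) could look like:

  /* hello_world_mpi.c: minimal illustrative MPI program (hypothetical
     example, not part of the Open MPI source tree). */
  #include <stdio.h>
  #include <mpi.h>

  int main(int argc, char *argv[])
  {
      int rank, size;

      /* Start up the MPI runtime */
      MPI_Init(&argc, &argv);

      /* Determine this process's rank and the total number of ranks */
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);

      printf("Hello, world: I am rank %d of %d\n", rank, size);

      /* Shut down the MPI runtime */
      MPI_Finalize();
      return 0;
  }

Such a program can be built with the Open MPI wrapper compiler and
launched with the BTL selection shown above, e.g.:

  shell$ mpicc hello_world_mpi.c -o hello_world_mpi
  shell$ mpirun --mca btl tcp,vader,self hello_world_mpi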