From 6674ba8776fb059f5abdcdc455486d28e0b3fa78 Mon Sep 17 00:00:00 2001
From: Jeff Squyres
Date: Sun, 2 Jan 2022 12:25:39 -0500
Subject: [PATCH] README.md: make the heterogeneous support more clear

Remove ambiguous warning language (it's not clear if the "THIS
FUNCTIONALITY..." warning applies to the option above or below the
warning) and make it clear exactly what the heterogeneous option
supports and does not support.

Signed-off-by: Jeff Squyres
(cherry picked from commit 0b0bae0ce5e3f6ba239f642473a800df263bac97)
---
 README.md | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index 01c26399159..8280b68efa4 100644
--- a/README.md
+++ b/README.md
@@ -1502,11 +1502,27 @@ Additionally, if a search directory is specified in the form
   Enable the PERUSE MPI data analysis interface.
 
 * `--enable-heterogeneous`:
-  Enable support for running on heterogeneous clusters (e.g., machines
-  with different endian representations). Heterogeneous support is
-  disabled by default because it imposes a minor performance penalty.
-
-  ***THIS FUNCTIONALITY IS CURRENTLY BROKEN - DO NOT USE***
+  Enable support for running on heterogeneous clusters where data
+  types are equivalent sizes across nodes, but may have differing
+  endian representations. Heterogeneous support is disabled by
+  default because it imposes a minor performance penalty.
+
+  Note that the MPI standard *does not guarantee that all
+  heterogeneous communication will function properly*, **especially
+  when the conversion between the different representations leads to
+  loss of accuracy or range.** For example, if a message with a
+  16-bit integer datatype is sent with value 0x1000 to a receiver
+  where the same integer datatype is only 8 bits, the value will be
+  truncated at the receiver. Similarly, problems can occur if a
+  floating point datatype in one MPI process uses X1 bits for its
+  mantissa and Y1 bits for its exponent, but the same floating point
+  datatype in another MPI process uses X2 and Y2 bits, respectively
+  (where X1 != X2 and/or Y1 != Y2). Type size differences like this
+  can lead to unexpected behavior.
+
+  Open MPI's heterogeneous support correctly handles endian
+  differences between datatype representations that are otherwise
+  compatible.
 
 * `--with-wrapper-cflags=CFLAGS`
 * `--with-wrapper-cxxflags=CXXFLAGS`