Incorrect behaviour with certain extent values for simple derived datatypes #6899
Thanks for the report! There is indeed an issue in Open MPI. Note that the test program you posted is incorrect with respect to the MPI standard, since overlapping derived datatypes cannot be used in receive operations. There is also a potential deadlock in the original test. I rewrote the test as follows:

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[]) {
        const int n = 8;
        int input[2*n];
        int output[2*n];
        int i, iproc;
        MPI_Datatype sendtype, tmp1, tmp2;
        MPI_Aint lb, ext;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &iproc);
        if (0 == iproc) {
            /* First case: extent resized to two base elements */
            for (i = 0; i < 2*n; i++) {
                output[i] = -1;
                input[i] = i;
            }
            MPI_Type_contiguous(n, MPI_INT, &tmp1);
            MPI_Type_get_extent(MPI_INT, &lb, &ext);
            lb = 0;
            ext *= 2;
            MPI_Type_create_resized(tmp1, lb, ext, &tmp2);
            MPI_Type_vector(2, 1, 1, tmp2, &sendtype);
            MPI_Type_commit(&sendtype);
            MPI_Sendrecv(input, 1, sendtype, 0, 0,
                         output, 2*n, MPI_INT, 0, 0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("With extent = %ld\n", ext);
            for (i = 0; i < 2*n; i++) {
                printf("%d ", output[i]);
            }
            printf("\n");
            MPI_Type_get_true_extent(sendtype, &lb, &ext);
            printf("true_extent = %ld\n", ext);
            MPI_Type_free(&tmp1);
            MPI_Type_free(&tmp2);
            MPI_Type_free(&sendtype);

            /* Second case: extent resized to one base element */
            for (i = 0; i < 2*n; i++) {
                output[i] = -1;
                input[i] = i;
            }
            MPI_Type_contiguous(n, MPI_INT, &tmp1);
            MPI_Type_get_extent(MPI_INT, &lb, &ext);
            lb = 0;
            ext *= 1;
            MPI_Type_create_resized(tmp1, lb, ext, &tmp2);
            MPI_Type_vector(2, 1, 1, tmp2, &sendtype);
            MPI_Type_commit(&sendtype);
            MPI_Sendrecv(input, 1, sendtype, 0, 0,
                         output, 2*n, MPI_INT, 0, 0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("With extent = %ld\n", ext);
            for (i = 0; i < 2*n; i++) {
                printf("%d ", output[i]);
            }
            printf("\n");
            MPI_Type_get_true_extent(sendtype, &lb, &ext);
            printf("true_extent = %ld\n", ext);
            MPI_Type_free(&tmp1);
            MPI_Type_free(&tmp2);
            MPI_Type_free(&sendtype);
        }
        MPI_Finalize();
        return 0;
    }
Thanks for the fast response and for confirming there's an issue. I had indeed wondered what would happen if an overlapping receive was used, given this non-deterministic behaviour, but didn't realise it was prohibited by the standard. I don't actually want to create an overlapping receive (or send) type; I just came across this while testing, trying to understand why a more complicated type wasn't working. (I thought my use of non-blocking send and receive would guard against a deadlock, but I hadn't thought about it carefully enough in the context of sending and receiving on the same rank, and I can imagine how that might cause an issue.)
On second thought, let me take back the part about the deadlock.
Not quite; the datatype engine works exactly according to the data description. Let me walk through the type map of your datatype. First, here is the code related to the datatype creation, with all variables substituted by their corresponding values:

    MPI_Type_contiguous(8, MPI_INT, &tmp1);
    MPI_Type_create_resized(tmp1, 0, 2*4, &tmp2);
    MPI_Type_vector(2, 1, 1, tmp2, &sendtype);

First we have a contiguous type with the typemap ((INT, 0), (INT, 4), (INT, 8), (INT, 12), (INT, 16), (INT, 20), (INT, 24), (INT, 28)). Then you resize this datatype based on the extent of the base type (INT), which leaves tmp2 with the same typemap but now an extent of 8 instead of 32 (8*4). So, when you repeat the tmp2 type twice, you get two copies of that typemap, the second shifted by the 8-byte extent.

So, if we assume one is allowed to receive using this datatype (to be clear, this is not the case in the current MPI standard), and the input data is received as a contiguous array, you should get exactly what you expected originally. I think the root issue here is a slight misunderstanding of MPI datatypes: the extent does not remove elements, it just moves the "reading head" (think of file operations) by that amount before the next read. Thus, since the tmp1 datatype contains 8 elements, they are all visible to the first repetition of the vector before it goes back and sends the second repetition from the same pointer plus the 8-byte extent.
I certainly expect that I don't fully understand MPI datatypes, but the example type map you give above is exactly what I expect to see, and indeed when the extent is twice the base type extent the code behaves exactly as expected and as described by that type map. It's when the extent is set to only one times the base type extent that the unexpected behaviour occurs. Given the above, I think this should give a type map of

    (INT, 0), (INT, 4), (INT, 8), (INT, 12), (INT, 16), (INT, 20), (INT, 24), (INT, 28)
    (INT, 4), (INT, 8), (INT, 12), (INT, 16), (INT, 20), (INT, 24), (INT, 28), (INT, 32)

but in practice it seems to produce something like

    (INT, 0), (INT, 4), (INT, 8), (INT, 12), (INT, 16), (INT, 20), (INT, 24), (INT, 28)
    (INT, 32), (INT, 36), (INT, 40), (INT, 44), (INT, 48), (INT, 52), (INT, 56), (INT, 60)

The updated example provided by @ggouaillardet, which switches to a custom send datatype, may make this clearer.
I looked at the datatype description and it looks exactly as it should, and I also looked at the generated output (with @ggouaillardet's test) and it also looks exactly as it should.

In the first case you print 8 elements and then reprint them skipping the first 2 (because of the extent of 8). In the second case you do the same but skip only the first int (the extent being 4).
@d7919 the typemap applies to the array used as the send buffer, right (because the sendrecv uses a contiguous buffer as the receive buffer)? If the typemap you mention were what the OMPI datatype generated, then the output would have been

and not

the output we currently see for an extent of 4.
Is there a discrepancy between the v4.0.x releases / branch and master?
In addition to the versions I tested with (2.1.1 and 3.1.1) a colleague has just tested with 4.0.1 and reported the same unexpected result. |
I was indeed testing with master. All stable branches seem to have this issue with extent 4, while they work with extent 8. Troubling. For 4.x there is an update pending (#6863).
It turns out that if a complex datatype built out of a single predefined type has been resized to the size of its predefined type, we miscompute the extent when creating other datatypes with the type. Provides a fix for open-mpi#6899 for the 3.x branch. Signed-off-by: George Bosilca <[email protected]>
Fix for the 3.x branch pending #6901.

diff --git a/opal/datatype/opal_datatype_add.c b/opal/datatype/opal_datatype_add.c
index 890f550..0013c8e 100644
--- a/opal/datatype/opal_datatype_add.c
+++ b/opal/datatype/opal_datatype_add.c
@@ -3,7 +3,7 @@
* Copyright (c) 2004-2006 The Trustees of Indiana University and Indiana
* University Research and Technology
* Corporation. All rights reserved.
- * Copyright (c) 2004-2016 The University of Tennessee and The University
+ * Copyright (c) 2004-2019 The University of Tennessee and The University
* of Tennessee Research Foundation. All rights
* reserved.
* Copyright (c) 2004-2006 High Performance Computing Center Stuttgart,
@@ -297,7 +297,8 @@ int32_t opal_datatype_add( opal_datatype_t* pdtBase, const opal_datatype_t* pd
     if( pdtAdd->btypes[i] != 0 ) pdtBase->btypes[i] += (count * pdtAdd->btypes[i]);
     if( (1 == pdtAdd->desc.used) && (extent == (pdtAdd->ub - pdtAdd->lb)) &&
-        (extent == pdtAdd->desc.desc[0].elem.extent) ){
+        (extent == pdtAdd->desc.desc[0].elem.extent) &&
+        (extent == (pdtAdd->true_ub - pdtAdd->true_lb)) ) {
         pLast->elem = pdtAdd->desc.desc[0].elem;
         pLast->elem.count *= count;
         pLast->elem.disp += disp;
It turns out that if a complex datatype built out of a single predefined type has been resized to the size of its predefined type, we miscompute the extent when creating other datatypes with the type. Provides a fix for open-mpi#6899 for the 3.x branch. Signed-off-by: George Bosilca <[email protected]> (cherry picked from commit f09d011)
Ok, this is now merged to all the branches on which we're going to fix it: v3.0.x, v3.1.x, v4.0.x. Closing.
Background information
I am attempting to construct a moderately complicated derived datatype but have been running into difficulties so have tried to strip back to a minimal case and in doing so I have come across what looks like incorrect behaviour. I have constructed a minimal example which can be accessed in this gist.
I first make a contiguous type corresponding to the first half of the message; next I use mpi_type_create_resized to change the extent, with the intention of making the start of the second half of the message overlap the first half. When the extent corresponds to two elements of the base type this works as expected: I see the first two elements of the first half of the message and then the entirety of the second half. If I change the extent to correspond to one element of the base type, then instead of seeing just the first element of the first half followed by the entirety of the second half, we get the entire first half of the message followed by the entire second half -- in other words, it is as if I hadn't set the extent at all. This test case uses only one processor.

I would have expected the output to look like

but instead I find

A version using mpi_f08 demonstrates identical behaviour. Similarly, changing the base_type from integer to real, complex, etc. behaves in the same way.

What version of Open MPI are you using? (e.g., v1.10.3, v2.1.0, git branch name and hash, etc.)
I've tested this with Open MPI v2.1.1 and Open MPI v3.1.1 and get the same output in both cases. These are on systems I don't manage so I'm not 100% sure how they were installed, but I'm fairly confident that the v2.1.1 has been installed from a package manager whilst v3.1.1 was installed from tarball. I've tried using Intel MPI (2018) and I get the expected behaviour for all extents.
Please describe the system on which you are running