Skip to content

opal/datatype: do not compute ptypes for OPAL predefined datatypes #3526

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 17, 2017

Conversation

ggouaillardet
Copy link
Contributor

Fixes #3522

Signed-off-by: Gilles Gouaillardet [email protected]

@ggouaillardet
Copy link
Contributor Author

@bosilca can you please review and merge if ok ?

OPAL predefined datatypes are declared as const, and hence the ptypes should not be set since this might be mapped in a read-only region (for example when configure'd with --enable-static)
also, computing the ptypes is an overkill here, so this patch is also an optimization.
(if you think an OPAL_LIKELY() is a good fit here, i will update the PR)

@jsquyres FYI

@jsquyres
Copy link
Member

@ggouaillardet Ah, now I understand your earlier comments on #3522 better. Thanks.

@ggouaillardet
Copy link
Contributor Author

@jsquyres next time i will try to use more words ...

@bosilca
Copy link
Member

bosilca commented May 15, 2017

This is strange. We alter the content of the OPAL predefined datatypes in opal_datatype_init (via the opal_datatype_basicDatatypes array), and that part of the code didn't trigger any segfault.

@ggouaillardet
Copy link
Contributor Author

@bosilca this is ok for me

    for( i = OPAL_DATATYPE_FIRST_TYPE; i < OPAL_DATATYPE_MAX_PREDEFINED; i++ ) {
        datatype = opal_datatype_basicDatatypes[i];

        /* All of the predefined OPAL types don't have any GAPS! */
        datatype->desc.desc[0].elem.common.flags = OPAL_DATATYPE_FLAG_PREDEFINED |
                                                   OPAL_DATATYPE_FLAG_DATA |
                                                   OPAL_DATATYPE_FLAG_CONTIGUOUS |
                                                   OPAL_DATATYPE_FLAG_NO_GAPS;
        ...

datatype points to a read-only region but datatype->desc.desc is a statically initialized once for all pointer to somewhere in opal_datatype_predefined_elem_desc, and that is not read-only memory

OPAL_DECLSPEC extern union dt_elem_desc opal_datatype_predefined_elem_desc[2 * OPAL_DATATYPE_MAX_PREDEFINED];

since we do not modify the pointer (read only) but the pointed object (read/write), this is ok.

makes sense ?

@bosilca
Copy link
Member

bosilca commented May 16, 2017

thanks @ggouaillardet for pointing this out. Indeed most of the datatype is const with the exception of the desc and which points to a non-const array.

I also found an issue in my code in the OPAL_DATATYPE_INIT_PTYPES_ARRAY_UNAVAILABLE macro. It should be (size_t[OPAL_DATATYPE_MAX_PREDEFINED]){ [OPAL_DATATYPE_MAX_PREDEFINED-1] = 0 } instead of NULL. So, we have 2 choices:

  • we accept NULL ptypes for all OPAL predefined datatypes (this would not work for composed datatypes). This saves some memory for all the predefined types, but will require @ggouaillardet test.
  • we have an array of ptypes for all OPAL predefined datatypes, and we use this array when computing the ptypes for other types.

Looking at this I wonder if we don't have a bug in opal_datatype_compute_ptypes for predefined composed datatype.

@ggouaillardet
Copy link
Contributor Author

@bosilca what do you mean by "predefined composed datatypes" ?

MPI_SHORT_INT and friends are predefined from an ompi point of view, but they are not from an opal point of view. unless i am missing something, NULL ptypes looks good to me for opal predefined datatypes

Copy link
Member

@bosilca bosilca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This patch fixes all known pb. 👍

@ggouaillardet ggouaillardet merged commit c4f64c3 into open-mpi:master May 17, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants