Refresh of the datatype engine from Topic/backport 6695 #6863

Merged: 14 commits, Aug 21, 2019

Conversation

bosilca (Member) commented Aug 5, 2019

A backport of the datatype improvements to the v4.0.x branch.

A few things to mention:

  • I had to pull in a few other commits to resolve some of the conflicts, but the changes are minor and mostly necessary anyway. The 2 commits I imported in addition to the original PR are 4211925 and d141bf7.
  • I had to manually alter one of the patches to work around the atomic types that were added in 000f9ee and have not yet been backported to the stable branch.

bosilca added 11 commits August 5, 2019 09:33
Move toward a base type of vector (count, type, blocklen, extent, disp),
with disp and extent applying to the count repetitions and blocklen
being a contiguous run of elements of type type.
Implement 2 optimizations on this description, used during type_commit:
- collapse: successive similar datatype descriptions are collapsed
together with an increased count.
- fusion: fuse successive datatype descriptions in order to minimize the
number of resulting memcpy calls during pack/unpack.

Fixes at the OMPI datatype level including:
 - Fix the create_hindexed and vector creation.
 - Fix the handling of [get|set]_elements and _count.
 - Correctly compute the displacement for block indexed types.
 - Support the MPI_LB and MPI_UB deprecation, aka. OMPI_ENABLE_MPI1_COMPAT.

Signed-off-by: George Bosilca <[email protected]>
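
Editor's sketch of the collapse and fusion ideas described in the commit message above. This is not the Open MPI code: the struct `vec_desc_t` and the functions `vec_block_bytes`, `vec_try_collapse`, and `vec_try_fuse` are hypothetical names, and the matching rules are simplified assumptions (byte-based displacements, fusion only for single-block descriptions that are adjacent in memory).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor modeled on (count, type, blocklen, extent, disp).
 * Not the actual Open MPI structure. */
typedef struct {
    size_t    count;     /* number of repetitions of the block */
    size_t    type_size; /* size in bytes of the predefined type */
    size_t    blocklen;  /* contiguous elements of the type per block */
    ptrdiff_t extent;    /* byte stride between successive blocks */
    ptrdiff_t disp;      /* byte displacement of the first block */
} vec_desc_t;

/* Number of payload bytes in one contiguous block. */
size_t vec_block_bytes(const vec_desc_t *d)
{
    assert(d != NULL);
    return d->blocklen * d->type_size;
}

/* collapse: if `next` continues the regular pattern of `cur` (same type,
 * blocklen, and extent, and it starts exactly one stride after the last
 * block of `cur`), absorb it by increasing cur->count.
 * Returns 1 if collapsed, 0 otherwise. */
int vec_try_collapse(vec_desc_t *cur, const vec_desc_t *next)
{
    if (cur->type_size != next->type_size ||
        cur->blocklen != next->blocklen ||
        cur->extent != next->extent) {
        return 0;
    }
    if (next->disp != cur->disp + (ptrdiff_t)cur->count * cur->extent) {
        return 0;
    }
    cur->count += next->count;
    return 1;
}

/* fusion (simplified): two single-block descriptions of the same type that
 * are adjacent in memory become one longer block, so pack/unpack needs a
 * single memcpy instead of two. Returns 1 if fused, 0 otherwise. */
int vec_try_fuse(vec_desc_t *cur, const vec_desc_t *next)
{
    if (cur->count != 1 || next->count != 1 ||
        cur->type_size != next->type_size) {
        return 0;
    }
    if (next->disp != cur->disp + (ptrdiff_t)vec_block_bytes(cur)) {
        return 0;
    }
    cur->blocklen += next->blocklen;
    cur->extent = (ptrdiff_t)vec_block_bytes(cur);
    return 1;
}
```

In this sketch, collapse reduces the number of descriptions that have to be traversed, while fusion reduces the number of copies; the real engine may apply different or additional conditions.
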
Update the comments to better reflect what is going on.
Minor indentations.

Signed-off-by: George Bosilca <[email protected]>
Merge contiguous iov in order to minimize the number of returned iovec.

Signed-off-by: George Bosilca <[email protected]>
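
Editor's sketch of merging contiguous iovec entries, as the commit message above describes. It uses the POSIX `struct iovec` from `<sys/uio.h>`; the function name `iov_merge_contiguous` is hypothetical and the in-place strategy is an assumption, not necessarily how the convertor does it.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Merge entries whose memory regions are back to back, in place.
 * Returns the new number of entries. Hypothetical helper, not an
 * Open MPI function. */
size_t iov_merge_contiguous(struct iovec *iov, size_t n)
{
    assert(iov != NULL || n == 0);
    if (n == 0) {
        return 0;
    }
    size_t out = 0; /* index of the entry currently being extended */
    for (size_t i = 1; i < n; i++) {
        char *end = (char *)iov[out].iov_base + iov[out].iov_len;
        if ((char *)iov[i].iov_base == end) {
            /* Entry i starts where the current one ends: extend it. */
            iov[out].iov_len += iov[i].iov_len;
        } else {
            /* Gap in memory: start a new output entry. */
            iov[++out] = iov[i];
        }
    }
    return out + 1;
}
```

Fewer entries means fewer elements handed to whatever consumes the iovec array (for example a vectored write), at the cost of one linear pass.
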
Rework the to_self test to be able to be used as a benchmark.

Signed-off-by: George Bosilca <[email protected]>
- optimize handling of contiguous with gaps datatypes.
- fixes a performance issue for all datatypes with a count of 1.
- optimize the pack/unpack of contiguous with gaps datatype.
- optimize the case of blocklen == 1

Signed-off-by: George Bosilca <[email protected]>
Signed-off-by: George Bosilca <[email protected]>
Upon detecting a datatype loop representation, skip the entire loop
according to the remaining space.

Signed-off-by: George Bosilca <[email protected]>
Optimize contiguous loops by collapsing them into a single element.
During datatype optimization collapse similar elements into larger
blocks.

Signed-off-by: George Bosilca <[email protected]>
Amazing how bad instruction scheduling can have such a drastic impact
on the code performance. With this change, we get a boost of at least
50% on the performance of data with a small blocklen and/or count.

Signed-off-by: George Bosilca <[email protected]>
Start optimizing the code.

This commit divides the operations into 2 parts: the first, outside the
critical path, deals with partial blocks of predefined elements, and the
second, inside the critical path, deals only with full blocks of
elements. This reduces the number of expensive operations in the
critical path and results in a decent performance increase.

Signed-off-by: George Bosilca <[email protected]>
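
Editor's sketch of the partial-block / full-block split described in the commit message above. The function `pack_strided` and its byte-based parameters are hypothetical; it only illustrates the idea of handling partial blocks outside the hot loop so the loop itself copies nothing but full blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Pack `nbytes` payload bytes from a strided source into `dst`.
 *   blk    : payload bytes per block
 *   stride : bytes between the starts of successive blocks (>= blk)
 *   start  : payload offset at which to resume (may fall inside a block)
 * Hypothetical helper, not an Open MPI function. */
void pack_strided(char *dst, const char *src, size_t blk, size_t stride,
                  size_t start, size_t nbytes)
{
    assert(blk > 0 && stride >= blk);
    size_t within = start % blk;                 /* offset inside the block */
    const char *s = src + (start / blk) * stride; /* start of that block */

    /* Part 1, outside the critical path: finish a partially consumed block. */
    if (within != 0) {
        size_t len = blk - within;
        if (len > nbytes) {
            len = nbytes;
        }
        memcpy(dst, s + within, len);
        dst += len;
        nbytes -= len;
        s += stride;
    }

    /* Part 2, the critical path: full blocks only, no per-block checks. */
    for (size_t full = nbytes / blk; full > 0; full--) {
        memcpy(dst, s, blk);
        dst += blk;
        s += stride;
    }
    nbytes %= blk;

    /* Trailing partial block, again outside the loop. */
    if (nbytes != 0) {
        memcpy(dst, s, nbytes);
    }
}
```

Keeping the remainder arithmetic and the length clamping out of the loop is what removes the expensive operations from the critical path in this sketch.
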
gpaulsen (Member) commented Aug 5, 2019

Thanks @bosilca!

jjhursey (Member) commented Aug 5, 2019

bot:ibm:gnu:retest

open-mpi deleted a comment from ibm-ompi Aug 5, 2019
hppritcha added the NEWS label Aug 5, 2019
hppritcha added this to the v4.0.2 milestone Aug 5, 2019
gpaulsen (Member) commented Aug 5, 2019

Fixes: #5540
Master PR: #6695

gpaulsen changed the title from "Topic/backport 6695" to "Refresh of the datatype engine from Topic/backport 6695" Aug 5, 2019
gpaulsen (Member) commented Aug 8, 2019

@derbeyn @ggouaillardet Can either of you please review this v4.0.x backport of PR #6695?

gpaulsen requested a review from ggouaillardet August 8, 2019 21:41
gpaulsen (Member) commented Aug 8, 2019

Hmm. It won't let me request a review from @derbeyn even though she reviewed the master PR.

gpaulsen (Member) commented

@ggouaillardet Are you able to review this PR? Once this goes in, we may be able to create a v4.0.2 rc1.

Fixes the convertor iovec description issue in MPI-IO reported by Edgar.

Signed-off-by: George Bosilca <[email protected]>
No code or logic changes.

Signed-off-by: George Bosilca <[email protected]>
Signed-off-by: Jeff Squyres <[email protected]>
hppritcha (Member) commented

bot:ompi:retest

gpaulsen (Member) commented

@ggouaillardet Can you please review this? This is the only PR blocking a v4.0.2 rc1 build.
Thanks!

gpaulsen merged commit 390e0bc into open-mpi:v4.0.x Aug 21, 2019
derbeyn (Contributor) commented Aug 26, 2019 via email

7 participants