Improve MPI_Waitall performance for MPI_THREAD_MULTIPLE #9291

Merged
merged 1 commit into from
Aug 19, 2021

awlauria
Contributor

Avoid atomic cmpxchg operations for MPI requests that are already
complete. This improves the performance in message rate benchmarks.

Signed-off-by: Austen Lauria [email protected]
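
The idea in the description can be sketched roughly as follows. This is a minimal illustration in C11 atomics, not the code from the commit: `request_t`, `wait_sync_t`, `REQ_PENDING`, `REQ_COMPLETED`, `attach_sync`, and `cas_count` are all hypothetical names invented for this example. It assumes each request carries one atomic word that is either "pending", "completed", or a pointer to the waiter's synchronization object, and it checks that word with a plain load before attempting the more expensive compare-and-exchange.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical sentinels for this sketch; not the real Open MPI definitions. */
#define REQ_PENDING   ((void *) 0)
#define REQ_COMPLETED ((void *) 1)

/* Hypothetical request: one atomic word holding REQ_PENDING,
 * REQ_COMPLETED, or a pointer to the waiter's sync object. */
typedef struct {
    _Atomic(void *) complete;
} request_t;

/* Hypothetical sync object a waiting thread would block on. */
typedef struct {
    int count;   /* placeholder field, unused in this sketch */
} wait_sync_t;

/* Instrumentation for the example only (not thread safe). */
static size_t cas_count = 0;

/* Attach `sync` to every request that is still pending.
 * Returns how many requests the caller still has to wait for. */
static size_t attach_sync(request_t *reqs, size_t n, wait_sync_t *sync)
{
    assert(sync != NULL);
    size_t attached = 0;

    for (size_t i = 0; i < n; i++) {
        /* Fast path: an acquire load is enough to see that the request
         * is already complete, so the compare-and-exchange is skipped. */
        if (atomic_load_explicit(&reqs[i].complete, memory_order_acquire)
                == REQ_COMPLETED) {
            continue;
        }

        /* Slow path: try to swing PENDING -> sync pointer atomically. */
        void *expected = REQ_PENDING;
        cas_count++;
        if (atomic_compare_exchange_strong(&reqs[i].complete, &expected,
                                           (void *) sync)) {
            attached++;
        }
        /* If the exchange failed, the request completed between the load
         * and the exchange, so there is nothing to wait for. */
    }
    return attached;
}
```

With this shape, a wait over requests that have mostly completed already issues only loads on the common path, which is where a message rate benchmark would be expected to benefit.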

@bosilca
Member

bosilca commented Aug 18, 2021

I would like to see some benchmark data before this gets merged. Unfortunately, our master is currently broken for shared memory (which is the BTL that should show the most benefit with this patch).

@bosilca
Member

bosilca commented Aug 19, 2021

I was able to run NetPIPE-5.1 over shared memory on M1 and x86_64, and aside from a certain level of variability, the impact seems to be negligible (at best I am seeing about 1 ns difference). This is what I expected for a non-contended memory location such as the request status. We need to keep in mind that contention on that specific memory location can only happen once, because the status is updated at most once, by the thread that did the progress.

@bosilca
Member

bosilca commented Aug 19, 2021

After curating the data and looking more carefully at the diff between the two runs, there is indeed a trend toward a small positive impact in favor of this PR. The gain is about 1 ns out of roughly 170 ns on an M1 (about 0.6%), and around 1 ns out of 300 ns on my Intel E5-2650 @ 2.3 GHz (about 0.3%).

@awlauria
Contributor Author

Thanks @bosilca

@awlauria awlauria merged commit a99747f into open-mpi:master Aug 19, 2021
@awlauria awlauria deleted the waitall_master branch August 19, 2021 12:43