Skip to content

xl: Use C11 builtin atomics if available. #8528

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

awlauria
Copy link
Contributor

If unavailable, fallback to gcc builtins.
Some configury work was needed to force include
stdatomic.h. Currently xl doesn't have it in its search
path.

Tested with xl V16.1.1.

Signed-off-by: Austen Lauria [email protected]

@ibm-ompi
Copy link

The IBM CI (XL) build failed! Please review the log, linked below.

Gist: https://gist.github.com/87387a129d5df1b5818dd106b4baaf5c

@awlauria awlauria force-pushed the remove_ppc_atomics_master branch 4 times, most recently from c8c14b9 to 9b2ef1b Compare February 25, 2021 23:05
@awlauria
Copy link
Contributor Author

bot:aws:retest

@awlauria awlauria requested a review from markalle February 26, 2021 15:55
If unavailable, fallback to gcc builtins.
Some configury work was needed to force include
stdatomic.h. Currently xl doesn't have it in its search
path.

Tested with xl V16.1.1.

Signed-off-by: Austen Lauria <[email protected]>
@awlauria awlauria force-pushed the remove_ppc_atomics_master branch from 9b2ef1b to 9dd2ce5 Compare March 1, 2021 16:10
@awlauria
Copy link
Contributor Author

awlauria commented Mar 1, 2021

I do see a fair sized regression with opal_lifo/opal_fifo make check tests from the current master in-line assembly xl atomics on power9. So adding the WIP label. It would appear that the c11 atomics via xl are not the way to go performance wise.

What I can do is allow xl + c11 if explicitly requested in configure, otherwise use the powerpc atomic headers by default.

There are xl builtins that may perform just as well/better than the assembly, I'll see if there's any merit to working those in.

@hjelmn
Copy link
Member

hjelmn commented Mar 1, 2021

For better performance for the lifo/fifo split out the LL/SC atomics as done for AArch64. See #8412

@hjelmn
Copy link
Member

hjelmn commented Mar 1, 2021

CAS is not a native concept on Power or AArch64. It can be implemented on top of LL/SC but the the CAS128 implementation of the lifo/fifo used in Open MPI are for avoiding ABA problems. LL/SC on its own is sufficient to avoid ABA. I did one more test with CAS128 with a CPU release (such as the one used in the LL/SC implementation) and even there the pure LL/SC lifo/fifo beat the CAS128 implementation by more than a factor of 2.

@awlauria
Copy link
Contributor Author

awlauria commented Mar 3, 2021

Thanks for the tips @hjelmn - I will be working on making those changes today.

@awlauria
Copy link
Contributor Author

Closing for #8649

@awlauria awlauria closed this Mar 22, 2021
@awlauria awlauria deleted the remove_ppc_atomics_master branch March 17, 2022 17:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants