Skip to content

Commit 95ebf2b

Browse files
authored
[libc++] Refactor the predicate taking variant of __cxx_atomic_wait (#80596)
This is a follow-up PR to <#79265>. It aims to be a gentle refactoring of the `__cxx_atomic_wait` function that takes a predicate. The key idea here is that this function's signature is changed to look like this (`std::function` used just for clarity): ```c++ __cxx_atomic_wait_fn(Atp*, std::function<bool(Tp &)> poll, memory_order __order); ``` ...where `Tp` is the corresponding `value_type` to the atomic variable type `Atp`. The function's semantics are similar to `atomic`s `.wait()`, but instead of having a hardcoded predicate (is the loaded value unequal to `old`?) the predicate is specified explicitly. The `poll` function may change its argument, and it is very important that if it returns `false`, it leaves its current understanding of the atomic's value in the argument. Internally, `__cxx_atomic_wait_fn` dispatches to two waiting mechanisms, depending on the type of the atomic variable: 1. If the atomic variable can be waited on directly (for example, Linux's futex mechanism only supports waiting on 32 bit long variables), the value of the atomic variable (which `poll` made its decision on) is then given to the underlying system wait function (e.g. futex). 2. If the atomic variable can not be waited on directly, there is a global pool of atomics that are used for this task. The ["eventcount" pattern](<https://gist.github.com/mratsim/04a29bdd98d6295acda4d0677c4d0041>) is employed to make this possible. The eventcount pattern needs a "monitor" variable which is read before the condition is checked another time. libcxx has the `__libcpp_atomic_monitor` function for this. However, this function only has to be called in case "2", i.e. when the eventcount is actually used. In case "1", the futex is used directly, so the monitor must be the value of the atomic variable that the `poll` function made its decision on to continue blocking. Previously, `__libcpp_atomic_monitor` was _also_ used in case "1". This was the source of the ABA style bug that PR#79265 fixed. However, the solution in PR#79265 has some disadvantages: - It exposes internals such as `cxx_contention_t` or the fact that `__libcpp_thread_poll_with_backoff` needs two functions to higher level constructs such as `semaphore`. - It doesn't prevent consumers calling `__cxx_atomic_wait` in an error prone way, i.e. by providing to it a predicate that doesn't take an argument. This makes ABA style issues more likely to appear. Now, `__cxx_atomic_wait_fn` takes just _one_ function, which is then transformed into the `poll` and `backoff` callables needed by `__libcpp_thread_poll_with_backoff`. Aside from the `__cxx_atomic_wait` changes, the only other change is the weakening of the initial atomic load of `semaphore`'s `try_acquire` into `memory_order_relaxed` and the CAS inside the loop is changed from `strong` to `weak`. Both weakenings should be fine, since the CAS is called in a loop, and the "acquire" semantics of `try_acquire` come from the CAS, not from the initial load.
1 parent 3f0404a commit 95ebf2b

File tree

4 files changed

+80
-51
lines changed

4 files changed

+80
-51
lines changed

libcxx/include/__atomic/atomic_sync.h

+66-35
Original file line numberDiff line numberDiff line change
@@ -27,6 +27,19 @@
2727

2828
_LIBCPP_BEGIN_NAMESPACE_STD
2929

30+
template <class _Atp, class _Poll>
31+
struct __libcpp_atomic_wait_poll_impl {
32+
_Atp* __a_;
33+
_Poll __poll_;
34+
memory_order __order_;
35+
36+
_LIBCPP_AVAILABILITY_SYNC
37+
_LIBCPP_HIDE_FROM_ABI bool operator()() const {
38+
auto __current_val = std::__cxx_atomic_load(__a_, __order_);
39+
return __poll_(__current_val);
40+
}
41+
};
42+
3043
#ifndef _LIBCPP_HAS_NO_THREADS
3144

3245
_LIBCPP_AVAILABILITY_SYNC _LIBCPP_EXPORTED_FROM_ABI void __cxx_atomic_notify_one(void const volatile*);
@@ -43,15 +56,40 @@ __libcpp_atomic_monitor(__cxx_atomic_contention_t const volatile*);
4356
_LIBCPP_AVAILABILITY_SYNC _LIBCPP_EXPORTED_FROM_ABI void
4457
__libcpp_atomic_wait(__cxx_atomic_contention_t const volatile*, __cxx_contention_t);
4558

46-
template <class _Atp, class _BackoffTest>
59+
template <class _Atp, class _Poll>
4760
struct __libcpp_atomic_wait_backoff_impl {
4861
_Atp* __a_;
49-
_BackoffTest __backoff_test_;
62+
_Poll __poll_;
63+
memory_order __order_;
64+
65+
_LIBCPP_AVAILABILITY_SYNC
66+
_LIBCPP_HIDE_FROM_ABI bool
67+
__poll_or_get_monitor(__cxx_atomic_contention_t const volatile*, __cxx_contention_t& __monitor) const {
68+
// In case the atomic can be waited on directly, the monitor value is just
69+
// the value of the atomic.
70+
// `__poll_` takes the current value of the atomic as an in-out argument
71+
// to potentially modify it. After it returns, `__monitor` has a value
72+
// which can be safely waited on by `std::__libcpp_atomic_wait` without any
73+
// ABA style issues.
74+
__monitor = std::__cxx_atomic_load(__a_, __order_);
75+
return __poll_(__monitor);
76+
}
77+
78+
_LIBCPP_AVAILABILITY_SYNC
79+
_LIBCPP_HIDE_FROM_ABI bool __poll_or_get_monitor(void const volatile*, __cxx_contention_t& __monitor) const {
80+
// In case we must wait on an atomic from the pool, the monitor comes from
81+
// `std::__libcpp_atomic_monitor`.
82+
// Only then we may read from `__a_`. This is the "event count" pattern.
83+
__monitor = std::__libcpp_atomic_monitor(__a_);
84+
auto __current_val = std::__cxx_atomic_load(__a_, __order_);
85+
return __poll_(__current_val);
86+
}
87+
5088
_LIBCPP_AVAILABILITY_SYNC
5189
_LIBCPP_HIDE_FROM_ABI bool operator()(chrono::nanoseconds __elapsed) const {
5290
if (__elapsed > chrono::microseconds(64)) {
53-
auto __monitor = std::__libcpp_atomic_monitor(__a_);
54-
if (__backoff_test_(__monitor))
91+
__cxx_contention_t __monitor;
92+
if (__poll_or_get_monitor(__a_, __monitor))
5593
return true;
5694
std::__libcpp_atomic_wait(__a_, __monitor);
5795
} else if (__elapsed > chrono::microseconds(4))
@@ -62,26 +100,20 @@ struct __libcpp_atomic_wait_backoff_impl {
62100
}
63101
};
64102

65-
template <class _Atp, class _Poll, class _BackoffTest>
66-
_LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI bool
67-
__cxx_atomic_wait(_Atp* __a, _Poll&& __poll, _BackoffTest&& __backoff_test) {
68-
__libcpp_atomic_wait_backoff_impl<_Atp, __decay_t<_BackoffTest> > __backoff_fn = {__a, __backoff_test};
69-
return std::__libcpp_thread_poll_with_backoff(__poll, __backoff_fn);
70-
}
71-
72-
template <class _Poll>
73-
struct __libcpp_atomic_wait_poll_as_backoff_test {
74-
_Poll __poll_;
75-
76-
_LIBCPP_AVAILABILITY_SYNC
77-
_LIBCPP_HIDE_FROM_ABI bool operator()(__cxx_contention_t&) const { return __poll_(); }
78-
};
79-
103+
// The semantics of this function are similar to `atomic`'s
104+
// `.wait(T old, std::memory_order order)`, but instead of having a hardcoded
105+
// predicate (is the loaded value unequal to `old`?), the predicate function is
106+
// specified as an argument. The loaded value is given as an in-out argument to
107+
// the predicate. If the predicate function returns `true`,
108+
// `_cxx_atomic_wait_unless` will return. If the predicate function returns
109+
// `false`, it must set the argument to its current understanding of the atomic
110+
// value. The predicate function must not return `false` spuriously.
80111
template <class _Atp, class _Poll>
81-
_LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI bool __cxx_atomic_wait(_Atp* __a, _Poll&& __poll) {
82-
__libcpp_atomic_wait_backoff_impl<_Atp, __libcpp_atomic_wait_poll_as_backoff_test<_Poll&> > __backoff_fn = {
83-
__a, {__poll}};
84-
return std::__libcpp_thread_poll_with_backoff(__poll, __backoff_fn);
112+
_LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI void
113+
__cxx_atomic_wait_unless(_Atp* __a, _Poll&& __poll, memory_order __order) {
114+
__libcpp_atomic_wait_poll_impl<_Atp, __decay_t<_Poll> > __poll_fn = {__a, __poll, __order};
115+
__libcpp_atomic_wait_backoff_impl<_Atp, __decay_t<_Poll> > __backoff_fn = {__a, __poll, __order};
116+
(void)std::__libcpp_thread_poll_with_backoff(__poll_fn, __backoff_fn);
85117
}
86118

87119
#else // _LIBCPP_HAS_NO_THREADS
@@ -90,9 +122,10 @@ template <class _Tp>
90122
_LIBCPP_HIDE_FROM_ABI void __cxx_atomic_notify_all(__cxx_atomic_impl<_Tp> const volatile*) {}
91123
template <class _Tp>
92124
_LIBCPP_HIDE_FROM_ABI void __cxx_atomic_notify_one(__cxx_atomic_impl<_Tp> const volatile*) {}
93-
template <class _Atp, class _Fn>
94-
_LIBCPP_HIDE_FROM_ABI bool __cxx_atomic_wait(_Atp*, _Fn&& __test_fn) {
95-
return std::__libcpp_thread_poll_with_backoff(__test_fn, __spinning_backoff_policy());
125+
template <class _Atp, class _Poll>
126+
_LIBCPP_HIDE_FROM_ABI void __cxx_atomic_wait_unless(_Atp* __a, _Poll&& __poll, memory_order __order) {
127+
__libcpp_atomic_wait_poll_impl<_Atp, __decay_t<_Poll> > __poll_fn = {__a, __poll, __order};
128+
(void)std::__libcpp_thread_poll_with_backoff(__poll_fn, __spinning_backoff_policy());
96129
}
97130

98131
#endif // _LIBCPP_HAS_NO_THREADS
@@ -102,21 +135,19 @@ _LIBCPP_HIDE_FROM_ABI bool __cxx_nonatomic_compare_equal(_Tp const& __lhs, _Tp c
102135
return std::memcmp(std::addressof(__lhs), std::addressof(__rhs), sizeof(_Tp)) == 0;
103136
}
104137

105-
template <class _Atp, class _Tp>
106-
struct __cxx_atomic_wait_test_fn_impl {
107-
_Atp* __a;
138+
template <class _Tp>
139+
struct __atomic_compare_unequal_to {
108140
_Tp __val;
109-
memory_order __order;
110-
_LIBCPP_HIDE_FROM_ABI bool operator()() const {
111-
return !std::__cxx_nonatomic_compare_equal(std::__cxx_atomic_load(__a, __order), __val);
141+
_LIBCPP_HIDE_FROM_ABI bool operator()(_Tp& __current_val) const {
142+
return !std::__cxx_nonatomic_compare_equal(__current_val, __val);
112143
}
113144
};
114145

115146
template <class _Atp, class _Tp>
116-
_LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI bool
147+
_LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI void
117148
__cxx_atomic_wait(_Atp* __a, _Tp const __val, memory_order __order) {
118-
__cxx_atomic_wait_test_fn_impl<_Atp, _Tp> __test_fn = {__a, __val, __order};
119-
return std::__cxx_atomic_wait(__a, __test_fn);
149+
__atomic_compare_unequal_to<_Tp> __poll_fn = {__val};
150+
std::__cxx_atomic_wait_unless(__a, __poll_fn, __order);
120151
}
121152

122153
_LIBCPP_END_NAMESPACE_STD

libcxx/include/latch

+9-2
Original file line numberDiff line numberDiff line change
@@ -97,9 +97,13 @@ public:
9797
if (__old == __update)
9898
__a_.notify_all();
9999
}
100-
inline _LIBCPP_HIDE_FROM_ABI bool try_wait() const noexcept { return 0 == __a_.load(memory_order_acquire); }
100+
inline _LIBCPP_HIDE_FROM_ABI bool try_wait() const noexcept {
101+
auto __value = __a_.load(memory_order_acquire);
102+
return try_wait_impl(__value);
103+
}
101104
inline _LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI void wait() const {
102-
__cxx_atomic_wait(&__a_.__a_, [this]() -> bool { return try_wait(); });
105+
__cxx_atomic_wait_unless(
106+
&__a_.__a_, [this](ptrdiff_t& __value) -> bool { return try_wait_impl(__value); }, memory_order_acquire);
103107
}
104108
inline _LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI void arrive_and_wait(ptrdiff_t __update = 1) {
105109
_LIBCPP_ASSERT_ARGUMENT_WITHIN_DOMAIN(__update >= 0, "latch::arrive_and_wait called with a negative value");
@@ -108,6 +112,9 @@ public:
108112
count_down(__update);
109113
wait();
110114
}
115+
116+
private:
117+
inline _LIBCPP_HIDE_FROM_ABI bool try_wait_impl(ptrdiff_t& __value) const noexcept { return __value == 0; }
111118
};
112119

113120
_LIBCPP_END_NAMESPACE_STD

libcxx/include/semaphore

+4-14
Original file line numberDiff line numberDiff line change
@@ -54,7 +54,6 @@ using binary_semaphore = counting_semaphore<1>;
5454
#include <__assert> // all public C++ headers provide the assertion handler
5555
#include <__atomic/atomic_base.h>
5656
#include <__atomic/atomic_sync.h>
57-
#include <__atomic/contention_t.h>
5857
#include <__atomic/memory_order.h>
5958
#include <__availability>
6059
#include <__chrono/time_point.h>
@@ -100,17 +99,8 @@ public:
10099
}
101100
}
102101
_LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI void acquire() {
103-
auto const __poll_fn = [this]() -> bool {
104-
auto __old = __a_.load(memory_order_relaxed);
105-
return (__old != 0) && __a_.compare_exchange_strong(__old, __old - 1, memory_order_acquire, memory_order_relaxed);
106-
};
107-
auto const __backoff_test = [this](__cxx_contention_t& __monitor) -> bool {
108-
ptrdiff_t __old = __monitor;
109-
bool __r = __try_acquire_impl(__old);
110-
__monitor = static_cast<__cxx_contention_t>(__old);
111-
return __r;
112-
};
113-
__cxx_atomic_wait(&__a_.__a_, __poll_fn, __backoff_test);
102+
__cxx_atomic_wait_unless(
103+
&__a_.__a_, [this](ptrdiff_t& __old) { return __try_acquire_impl(__old); }, memory_order_relaxed);
114104
}
115105
template <class _Rep, class _Period>
116106
_LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI bool
@@ -121,7 +111,7 @@ public:
121111
return std::__libcpp_thread_poll_with_backoff(__poll_fn, __libcpp_timed_backoff_policy(), __rel_time);
122112
}
123113
_LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI bool try_acquire() {
124-
auto __old = __a_.load(memory_order_acquire);
114+
auto __old = __a_.load(memory_order_relaxed);
125115
return __try_acquire_impl(__old);
126116
}
127117

@@ -130,7 +120,7 @@ private:
130120
while (true) {
131121
if (__old == 0)
132122
return false;
133-
if (__a_.compare_exchange_strong(__old, __old - 1, memory_order_acquire, memory_order_relaxed))
123+
if (__a_.compare_exchange_weak(__old, __old - 1, memory_order_acquire, memory_order_relaxed))
134124
return true;
135125
}
136126
}

libcxx/src/atomic.cpp

+1
Original file line numberDiff line numberDiff line change
@@ -178,6 +178,7 @@ _LIBCPP_EXPORTED_FROM_ABI void __cxx_atomic_notify_one(__cxx_atomic_contention_t
178178
_LIBCPP_EXPORTED_FROM_ABI void __cxx_atomic_notify_all(__cxx_atomic_contention_t const volatile* __location) {
179179
__libcpp_contention_notify(&__libcpp_contention_state(__location)->__contention_state, __location, false);
180180
}
181+
// This function is never used, but still exported for ABI compatibility.
181182
_LIBCPP_EXPORTED_FROM_ABI __cxx_contention_t
182183
__libcpp_atomic_monitor(__cxx_atomic_contention_t const volatile* __location) {
183184
return __libcpp_contention_monitor_for_wait(&__libcpp_contention_state(__location)->__contention_state, __location);

0 commit comments

Comments
 (0)