
[libc++] Refactor the predicate taking variant of __cxx_atomic_wait #80596


Merged
merged 7 commits into from
Feb 19, 2024

Conversation

jiixyj
Contributor

@jiixyj jiixyj commented Feb 4, 2024

This is a follow-up PR to #79265. It aims to be a gentle refactoring of the __cxx_atomic_wait function that takes a predicate.

The key idea here is that this function's signature is changed to look like this (std::function used just for clarity):

__cxx_atomic_wait_fn(Atp*, std::function<bool(Tp&)> poll, memory_order order);

...where Tp is the corresponding value_type of the atomic variable type Atp. The function's semantics are similar to atomic's .wait(), but instead of having a hardcoded predicate (is the loaded value unequal to old?), the predicate is specified explicitly.

The poll function may modify its argument; it is essential that whenever it returns false, it leaves its latest observation of the atomic's value in that argument. Internally, __cxx_atomic_wait_fn dispatches to one of two waiting mechanisms, depending on the type of the atomic variable:

  1. If the atomic variable can be waited on directly (for example, Linux's futex mechanism only supports waiting on 32-bit variables), the value of the atomic variable (on which poll made its decision) is passed to the underlying system wait function (e.g. futex).
  2. If the atomic variable cannot be waited on directly, a global pool of atomics is used for this task instead. The "eventcount" pattern makes this possible.

The eventcount pattern needs a "monitor" variable that is read before the condition is checked one more time. libc++ has the __libcpp_atomic_monitor function for this. However, this function only has to be called in case 2, i.e. when the eventcount is actually used. In case 1, the futex is used directly, so the monitor must be the value of the atomic variable on which the poll function based its decision to continue blocking. Previously, __libcpp_atomic_monitor was also used in case 1; this was the source of the ABA-style bug that #79265 fixed.

However, the solution in #79265 has some disadvantages:

  • It exposes internals, such as cxx_contention_t or the fact that __libcpp_thread_poll_with_backoff needs two functions, to higher-level constructs such as semaphore.
  • It doesn't prevent consumers from calling __cxx_atomic_wait in an error-prone way, i.e. by passing it a predicate that doesn't take an argument. This makes ABA-style issues more likely to appear.

Now, __cxx_atomic_wait_fn takes just one function, which is then transformed into the poll and backoff callables needed by __libcpp_thread_poll_with_backoff.

Aside from the __cxx_atomic_wait changes, the only other change is in semaphore's try_acquire: the initial atomic load is weakened to memory_order_relaxed, and the CAS inside the loop is changed from strong to weak. Both weakenings should be fine, since the CAS is called in a loop and the "acquire" semantics of try_acquire come from the CAS, not from the initial load.

@jiixyj jiixyj requested a review from a team as a code owner February 4, 2024 13:37
@llvmbot llvmbot added the libc++ label (C++ Standard Library. Not GNU libstdc++. Not libc++abi.) Feb 4, 2024
@llvmbot
Member

llvmbot commented Feb 4, 2024

@llvm/pr-subscribers-libcxx

Author: Jan Kokemüller (jiixyj)


Full diff: https://github.com/llvm/llvm-project/pull/80596.diff

5 Files Affected:

  • (modified) libcxx/include/__atomic/atomic_sync.h (+69-22)
  • (modified) libcxx/include/latch (+9-2)
  • (modified) libcxx/include/semaphore (+14-15)
  • (modified) libcxx/src/atomic.cpp (+1)
  • (added) libcxx/test/std/thread/thread.semaphore/lost_wakeup.pass.cpp (+64)
diff --git a/libcxx/include/__atomic/atomic_sync.h b/libcxx/include/__atomic/atomic_sync.h
index 93527958b2e1c..1eca545afb99f 100644
--- a/libcxx/include/__atomic/atomic_sync.h
+++ b/libcxx/include/__atomic/atomic_sync.h
@@ -43,17 +43,55 @@ __libcpp_atomic_monitor(__cxx_atomic_contention_t const volatile*);
 _LIBCPP_AVAILABILITY_SYNC _LIBCPP_EXPORTED_FROM_ABI void
 __libcpp_atomic_wait(__cxx_atomic_contention_t const volatile*, __cxx_contention_t);
 
-template <class _Atp, class _Fn>
+template <class _Atp, class _Poll>
+struct __libcpp_atomic_wait_poll_impl {
+  _Atp* __a_;
+  _Poll __poll_;
+  memory_order __order_;
+
+  _LIBCPP_AVAILABILITY_SYNC
+  _LIBCPP_HIDE_FROM_ABI bool operator()() const {
+    auto __current_val = __cxx_atomic_load(__a_, __order_);
+    return __poll_(__current_val);
+  }
+};
+
+template <class _Atp, class _Poll>
 struct __libcpp_atomic_wait_backoff_impl {
-  _Atp* __a;
-  _Fn __test_fn;
+  _Atp* __a_;
+  _Poll __poll_;
+  memory_order __order_;
+
+  _LIBCPP_AVAILABILITY_SYNC
+  _LIBCPP_HIDE_FROM_ABI bool
+  __poll_or_get_monitor(__cxx_atomic_contention_t const volatile*, __cxx_contention_t& __monitor) const {
+    // In case the atomic can be waited on directly, the monitor value is just
+    // the value of the atomic.
+    // `__poll_` takes the current value of the atomic as an in-out argument
+    // to potentially modify it. After it returns, `__monitor` has a value
+    // which can be safely waited on by `std::__libcpp_atomic_wait` without any
+    // ABA style issues.
+    __monitor = __cxx_atomic_load(__a_, __order_);
+    return __poll_(__monitor);
+  }
+
+  _LIBCPP_AVAILABILITY_SYNC
+  _LIBCPP_HIDE_FROM_ABI bool __poll_or_get_monitor(void const volatile*, __cxx_contention_t& __monitor) const {
+    // In case we must wait on an atomic from the pool, the monitor comes from
+    // `std::__libcpp_atomic_monitor`.
+    // Only then we may read from `__a_`. This is the "event count" pattern.
+    __monitor          = std::__libcpp_atomic_monitor(__a_);
+    auto __current_val = __cxx_atomic_load(__a_, __order_);
+    return __poll_(__current_val);
+  }
+
   _LIBCPP_AVAILABILITY_SYNC
   _LIBCPP_HIDE_FROM_ABI bool operator()(chrono::nanoseconds __elapsed) const {
     if (__elapsed > chrono::microseconds(64)) {
-      auto const __monitor = std::__libcpp_atomic_monitor(__a);
-      if (__test_fn())
+      __cxx_contention_t __monitor;
+      if (__poll_or_get_monitor(__a_, __monitor))
         return true;
-      std::__libcpp_atomic_wait(__a, __monitor);
+      std::__libcpp_atomic_wait(__a_, __monitor);
     } else if (__elapsed > chrono::microseconds(4))
       __libcpp_thread_yield();
     else {
@@ -62,10 +100,20 @@ struct __libcpp_atomic_wait_backoff_impl {
   }
 };
 
-template <class _Atp, class _Fn>
-_LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI bool __cxx_atomic_wait(_Atp* __a, _Fn&& __test_fn) {
-  __libcpp_atomic_wait_backoff_impl<_Atp, __decay_t<_Fn> > __backoff_fn = {__a, __test_fn};
-  return std::__libcpp_thread_poll_with_backoff(__test_fn, __backoff_fn);
+// The semantics of this function are similar to `atomic`'s
+// `.wait(T old, std::memory_order order)`, but instead of having a hardcoded
+// predicate (is the loaded value unequal to `old`?), the predicate function is
+// specified as an argument. The loaded value is given as an in-out argument to
+// the predicate. If the predicate function returns `true`,
+// `_cxx_atomic_wait_fn` will return. If the predicate function returns
+// `false`, it must set the argument to its current understanding of the atomic
+// value. The predicate function must not return `false` spuriously.
+template <class _Atp, class _Poll>
+_LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI void
+__cxx_atomic_wait_fn(_Atp* __a, _Poll&& __poll, memory_order __order) {
+  __libcpp_atomic_wait_poll_impl<_Atp, __decay_t<_Poll> > __poll_fn       = {__a, __poll, __order};
+  __libcpp_atomic_wait_backoff_impl<_Atp, __decay_t<_Poll> > __backoff_fn = {__a, __poll, __order};
+  (void)std::__libcpp_thread_poll_with_backoff(__poll_fn, __backoff_fn);
 }
 
 #else // _LIBCPP_HAS_NO_THREADS
@@ -74,9 +122,10 @@ template <class _Tp>
 _LIBCPP_HIDE_FROM_ABI void __cxx_atomic_notify_all(__cxx_atomic_impl<_Tp> const volatile*) {}
 template <class _Tp>
 _LIBCPP_HIDE_FROM_ABI void __cxx_atomic_notify_one(__cxx_atomic_impl<_Tp> const volatile*) {}
-template <class _Atp, class _Fn>
-_LIBCPP_HIDE_FROM_ABI bool __cxx_atomic_wait(_Atp*, _Fn&& __test_fn) {
-  return std::__libcpp_thread_poll_with_backoff(__test_fn, __spinning_backoff_policy());
+template <class _Atp, class _Poll>
+_LIBCPP_HIDE_FROM_ABI void __cxx_atomic_wait_fn(_Atp* __a, _Poll&& __poll, memory_order __order) {
+  __libcpp_atomic_wait_poll_impl<_Atp, __decay_t<_Poll> > __poll_fn = {__a, __poll, __order};
+  (void)std::__libcpp_thread_poll_with_backoff(__poll_fn, __spinning_backoff_policy());
 }
 
 #endif // _LIBCPP_HAS_NO_THREADS
@@ -86,21 +135,19 @@ _LIBCPP_HIDE_FROM_ABI bool __cxx_nonatomic_compare_equal(_Tp const& __lhs, _Tp c
   return std::memcmp(std::addressof(__lhs), std::addressof(__rhs), sizeof(_Tp)) == 0;
 }
 
-template <class _Atp, class _Tp>
-struct __cxx_atomic_wait_test_fn_impl {
-  _Atp* __a;
+template <class _Tp>
+struct __cxx_atomic_wait_poll_fn_impl {
   _Tp __val;
-  memory_order __order;
-  _LIBCPP_HIDE_FROM_ABI bool operator()() const {
-    return !std::__cxx_nonatomic_compare_equal(std::__cxx_atomic_load(__a, __order), __val);
+  _LIBCPP_HIDE_FROM_ABI bool operator()(_Tp& __current_val) const {
+    return !std::__cxx_nonatomic_compare_equal(__current_val, __val);
   }
 };
 
 template <class _Atp, class _Tp>
-_LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI bool
+_LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI void
 __cxx_atomic_wait(_Atp* __a, _Tp const __val, memory_order __order) {
-  __cxx_atomic_wait_test_fn_impl<_Atp, _Tp> __test_fn = {__a, __val, __order};
-  return std::__cxx_atomic_wait(__a, __test_fn);
+  __cxx_atomic_wait_poll_fn_impl<_Tp> __poll_fn = {__val};
+  std::__cxx_atomic_wait_fn(__a, __poll_fn, __order);
 }
 
 _LIBCPP_END_NAMESPACE_STD
diff --git a/libcxx/include/latch b/libcxx/include/latch
index ad7b35579913b..45c1b5c91e6cd 100644
--- a/libcxx/include/latch
+++ b/libcxx/include/latch
@@ -97,9 +97,13 @@ public:
     if (__old == __update)
       __a_.notify_all();
   }
-  inline _LIBCPP_HIDE_FROM_ABI bool try_wait() const noexcept { return 0 == __a_.load(memory_order_acquire); }
+  inline _LIBCPP_HIDE_FROM_ABI bool try_wait() const noexcept {
+    auto __value = __a_.load(memory_order_acquire);
+    return try_wait_impl(__value);
+  }
   inline _LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI void wait() const {
-    __cxx_atomic_wait(&__a_.__a_, [this]() -> bool { return try_wait(); });
+    __cxx_atomic_wait_fn(
+        &__a_.__a_, [this](ptrdiff_t& __value) -> bool { return try_wait_impl(__value); }, memory_order_acquire);
   }
   inline _LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI void arrive_and_wait(ptrdiff_t __update = 1) {
     _LIBCPP_ASSERT_ARGUMENT_WITHIN_DOMAIN(__update >= 0, "latch::arrive_and_wait called with a negative value");
@@ -108,6 +112,9 @@ public:
     count_down(__update);
     wait();
   }
+
+private:
+  inline _LIBCPP_HIDE_FROM_ABI bool try_wait_impl(ptrdiff_t& __value) const noexcept { return __value == 0; }
 };
 
 _LIBCPP_END_NAMESPACE_STD
diff --git a/libcxx/include/semaphore b/libcxx/include/semaphore
index 649705f45b049..27e5951e7d982 100644
--- a/libcxx/include/semaphore
+++ b/libcxx/include/semaphore
@@ -54,6 +54,7 @@ using binary_semaphore = counting_semaphore<1>;
 #include <__assert> // all public C++ headers provide the assertion handler
 #include <__atomic/atomic_base.h>
 #include <__atomic/atomic_sync.h>
+#include <__atomic/contention_t.h>
 #include <__atomic/memory_order.h>
 #include <__availability>
 #include <__chrono/time_point.h>
@@ -94,35 +95,33 @@ public:
     auto __old = __a_.fetch_add(__update, memory_order_release);
     _LIBCPP_ASSERT_ARGUMENT_WITHIN_DOMAIN(
         __update <= _LIBCPP_SEMAPHORE_MAX - __old, "update is greater than the expected value");
-
-    if (__old > 0) {
-      // Nothing to do
-    } else if (__update > 1)
+    if (__old == 0) {
       __a_.notify_all();
-    else
-      __a_.notify_one();
+    }
   }
   _LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI void acquire() {
-    auto const __test_fn = [this]() -> bool {
-      auto __old = __a_.load(memory_order_relaxed);
-      return (__old != 0) && __a_.compare_exchange_strong(__old, __old - 1, memory_order_acquire, memory_order_relaxed);
-    };
-    __cxx_atomic_wait(&__a_.__a_, __test_fn);
+    __cxx_atomic_wait_fn(
+        &__a_.__a_, [this](ptrdiff_t& __old) { return __try_acquire_impl(__old); }, memory_order_relaxed);
   }
   template <class _Rep, class _Period>
   _LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI bool
   try_acquire_for(chrono::duration<_Rep, _Period> const& __rel_time) {
     if (__rel_time == chrono::duration<_Rep, _Period>::zero())
       return try_acquire();
-    auto const __test_fn = [this]() { return try_acquire(); };
-    return std::__libcpp_thread_poll_with_backoff(__test_fn, __libcpp_timed_backoff_policy(), __rel_time);
+    auto const __poll_fn = [this]() { return try_acquire(); };
+    return std::__libcpp_thread_poll_with_backoff(__poll_fn, __libcpp_timed_backoff_policy(), __rel_time);
   }
   _LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI bool try_acquire() {
-    auto __old = __a_.load(memory_order_acquire);
+    auto __old = __a_.load(memory_order_relaxed);
+    return __try_acquire_impl(__old);
+  }
+
+private:
+  _LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI bool __try_acquire_impl(ptrdiff_t& __old) {
     while (true) {
       if (__old == 0)
         return false;
-      if (__a_.compare_exchange_strong(__old, __old - 1, memory_order_acquire, memory_order_relaxed))
+      if (__a_.compare_exchange_weak(__old, __old - 1, memory_order_acquire, memory_order_relaxed))
         return true;
     }
   }
diff --git a/libcxx/src/atomic.cpp b/libcxx/src/atomic.cpp
index 2f0389ae6974a..2b67685c8a0a1 100644
--- a/libcxx/src/atomic.cpp
+++ b/libcxx/src/atomic.cpp
@@ -178,6 +178,7 @@ _LIBCPP_EXPORTED_FROM_ABI void __cxx_atomic_notify_one(__cxx_atomic_contention_t
 _LIBCPP_EXPORTED_FROM_ABI void __cxx_atomic_notify_all(__cxx_atomic_contention_t const volatile* __location) {
   __libcpp_contention_notify(&__libcpp_contention_state(__location)->__contention_state, __location, false);
 }
+// This function is never used, but still exported for ABI compatibility.
 _LIBCPP_EXPORTED_FROM_ABI __cxx_contention_t
 __libcpp_atomic_monitor(__cxx_atomic_contention_t const volatile* __location) {
   return __libcpp_contention_monitor_for_wait(&__libcpp_contention_state(__location)->__contention_state, __location);
diff --git a/libcxx/test/std/thread/thread.semaphore/lost_wakeup.pass.cpp b/libcxx/test/std/thread/thread.semaphore/lost_wakeup.pass.cpp
new file mode 100644
index 0000000000000..dca3c01cf0103
--- /dev/null
+++ b/libcxx/test/std/thread/thread.semaphore/lost_wakeup.pass.cpp
@@ -0,0 +1,64 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+// UNSUPPORTED: no-threads
+// UNSUPPORTED: c++03, c++11, c++14, c++17
+
+// XFAIL: availability-synchronization_library-missing
+
+// This is a regression test for https://llvm.org/PR47013.
+
+// <semaphore>
+
+#include <barrier>
+#include <semaphore>
+#include <thread>
+#include <vector>
+
+#include "make_test_thread.h"
+
+static std::counting_semaphore<> s(0);
+static std::barrier<> b(8 + 1);
+
+void acquire() {
+  for (int i = 0; i < 10'000; ++i) {
+    s.acquire();
+    b.arrive_and_wait();
+  }
+}
+
+void release() {
+  for (int i = 0; i < 10'000; ++i) {
+    s.release(1);
+    s.release(1);
+    s.release(1);
+    s.release(1);
+
+    s.release(1);
+    s.release(1);
+    s.release(1);
+    s.release(1);
+
+    b.arrive_and_wait();
+  }
+}
+
+int main(int, char**) {
+  for (int run = 0; run < 20; ++run) {
+    std::vector<std::thread> threads;
+    for (int i = 0; i < 8; ++i)
+      threads.push_back(support::make_test_thread(acquire));
+
+    threads.push_back(support::make_test_thread(release));
+
+    for (auto& thread : threads)
+      thread.join();
+  }
+
+  return 0;
+}

@jiixyj jiixyj marked this pull request as draft February 4, 2024 13:49
@jiixyj
Contributor Author

jiixyj commented Feb 4, 2024

As this is based on #79265, which is not yet merged, I converted this into a draft PR.

Member

@ldionne ldionne left a comment

Thanks for the patch, and sorry it took so long to get to it; the last few weeks have been quite busy. I appreciate that you spent time figuring out how to refactor this code and make it less prone to bugs -- it's important since this code is really complex.

I generally like the approach. I have a few comments and questions but I think we could move forward with this, as it does make the code better and also opens the door to more refactoring like #81427.

I must say I kinda liked the approach suggested by @huixie90 in #79265 where he refactored latch and barrier to use atomic::wait directly, but I guess we can move forward with this patch and then do that other approach in case we end up getting consensus to do it.

I would like @huixie90 to stamp this patch before I give it a last LGTM.


_LIBCPP_AVAILABILITY_SYNC
_LIBCPP_HIDE_FROM_ABI bool
__poll_or_get_monitor(__cxx_atomic_contention_t const volatile*, __cxx_contention_t& __monitor) const {
Member

IIUC, this is roughly what @huixie90 is transforming into a CPO in #81427 .

Contributor Author

Yes, the two __poll_or_get_monitor functions here are basically the same thing as the two __update_monitor_val_and_poll functions from #81427 (here).

__a, {__poll}};
return std::__libcpp_thread_poll_with_backoff(__poll, __backoff_fn);
_LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI void
__cxx_atomic_wait_fn(_Atp* __a, _Poll&& __poll, memory_order __order) {
Member

We often use _fn to denote things that are implementation details or function objects in the library. IMO, keeping this function named __cxx_atomic_wait would be better.

Contributor Author

@jiixyj jiixyj Feb 18, 2024

We often use _fn to denote things that are implementation details or function objects in the library.

Thanks, that is good to know!

IMO, keeping this function named __cxx_atomic_wait would be better.

I agree. It would conflict with the "normal" __cxx_atomic_wait though, as the _Poll type parameter is unconstrained. Instead of doing some SFINAE hacks I renamed it to __cxx_atomic_wait_unless in anticipation of #81427.

struct __cxx_atomic_wait_test_fn_impl {
_Atp* __a;
template <class _Tp>
struct __cxx_atomic_wait_poll_fn_impl {
Member

I would suggest renaming this to __atomic_compare_equal_to or something like that -- that's still not a great name but it is a bit more descriptive, and in the future we could probably write this as a lambda (which is what would really make this code easier to read).

Contributor Author

Yes, a lambda would definitely help. I renamed this struct to __atomic_compare_unequal_to. So the logic can be read as "wait unless/until the atomic variable compares unequal to val" (similar to the wording in the spec: https://eel.is/c++draft/atomics#ref.ops-23.2).

@@ -178,6 +178,7 @@ _LIBCPP_EXPORTED_FROM_ABI void __cxx_atomic_notify_one(__cxx_atomic_contention_t
_LIBCPP_EXPORTED_FROM_ABI void __cxx_atomic_notify_all(__cxx_atomic_contention_t const volatile* __location) {
__libcpp_contention_notify(&__libcpp_contention_state(__location)->__contention_state, __location, false);
}
// This function is never used, but still exported for ABI compatibility.
Member

Is this because this is handled by __poll_or_get_monitor now?

Contributor Author

Yes, exactly. This overload of __libcpp_atomic_monitor was used to get the "monitor" value when the semaphore's ptrdiff_t could be waited on directly. And this was the reason for the original ABA bug, where this could happen:

  1. __libcpp_atomic_monitor returns "1" (i.e. the semaphore has the value 1)
  2. another thread swoops in, takes the "1" down to "0"
  3. "our" poll function returns "false", because there is nothing left to try_acquire
  4. another thread releases the semaphore (increases the value from "0" to "1")
  5. atomic_wait now waits on the "1", resulting in a deadlock.

Contributor Author

And now, __poll_or_get_monitor makes sure that in this case, the "monitor" value is always "0" (because return __poll_(__monitor); sets __monitor to "0" if it returns false).

@huixie90
Contributor

LGTM with comments from Louis resolved.

I could continue the discussion on whether to refactor the code to use the public wait API after this lands.
Also, just FYI, I have further refactoring to make these functions reusable by atomic_ref (built on top of this patch):
#81427

Member

@ldionne ldionne left a comment

Since I am going OOO for a week and I don't want to block this on my approval, please go ahead with the merge once you feel like my comments/questions have been resolved/answered. Don't wait for me to stamp again.

Thanks for the patch, again.

@jiixyj jiixyj force-pushed the atomic-wait-fixes-2 branch from aced309 to 4ef3415 Compare February 18, 2024 11:26
@jiixyj jiixyj force-pushed the atomic-wait-fixes-2 branch from 4ef3415 to f151e17 Compare February 18, 2024 11:41
@jiixyj jiixyj force-pushed the atomic-wait-fixes-2 branch from 9a3f43e to 2daeb53 Compare February 18, 2024 11:55
@jiixyj
Contributor Author

jiixyj commented Feb 18, 2024

Thank you @huixie90 and @ldionne for the review! From my side this is good to go.

@huixie90 huixie90 merged commit 95ebf2b into llvm:main Feb 19, 2024
huixie90 added a commit that referenced this pull request Mar 2, 2024
#81427)

The goal of this patch is to make `atomic`'s wait functions reusable by
`atomic_ref`.
#76647

First, this patch is built on top of
#80596 , to reduce the future
merge conflicts.

This patch makes the following functions "APIs" to be used by
`atomic`, `atomic_flag`, `semaphore`, `latch`, and `atomic_ref`:

```
__atomic_wait
__atomic_wait_unless
__atomic_notify_one
__atomic_notify_all
```

These functions are made generic to support both `atomic` and
`atomic_ref`. There are two customisation points.

```
// How to load the value from the given type (with a memory order)
__atomic_load
```


```
// what is the contention address that the platform `wait` function is going to monitor
__atomic_contention_address
```


For `atomic_ref` (not implemented in this patch), the `load` and
`address` functions will be different, because
- it does not use the "atomic abstraction layer", so the `load` operation
will be some gcc builtin
- the contention address will be the user's actual type that the
`atomic_ref` points to
blueboxd pushed a commit to blueboxd/libcxx that referenced this pull request May 4, 2024
Labels
libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
4 participants