
[libc++] Refactor the predicate taking variant of __cxx_atomic_wait #80596


Merged
merged 7 commits into from
Feb 19, 2024

Conversation

jiixyj
Contributor

@jiixyj jiixyj commented Feb 4, 2024

This is a follow-up PR to #79265. It aims to be a gentle refactoring of the __cxx_atomic_wait function that takes a predicate.

The key idea here is that this function's signature is changed to look like this (std::function used just for clarity):

__cxx_atomic_wait_fn(Atp*, std::function<bool(Tp&)> poll, memory_order order);

...where Tp is the corresponding value_type of the atomic variable type Atp. The function's semantics are similar to atomic's .wait(), but instead of having a hardcoded predicate (is the loaded value unequal to old?), the predicate is specified explicitly.

The poll function may modify its argument; it is essential that whenever it returns false, it leaves its latest observation of the atomic's value in that argument. Internally, __cxx_atomic_wait_fn dispatches to one of two waiting mechanisms, depending on the type of the atomic variable:

  1. If the atomic variable can be waited on directly (for example, Linux's futex mechanism only supports waiting on 32-bit variables), the value of the atomic variable (on which poll made its decision) is passed to the underlying system wait function (e.g. futex).
  2. If the atomic variable cannot be waited on directly, a global pool of atomics is used for this task instead. The "eventcount" pattern makes this possible.

The eventcount pattern needs a "monitor" variable that is read before the condition is checked one more time. libc++ has the __libcpp_atomic_monitor function for this. However, this function only has to be called in case 2, i.e. when the eventcount is actually used. In case 1, the futex is used directly, so the monitor must be the value of the atomic variable on which the poll function based its decision to continue blocking. Previously, __libcpp_atomic_monitor was also used in case 1; this was the source of the ABA-style bug that #79265 fixed.

However, the solution in #79265 has some disadvantages:

  • It exposes internals, such as cxx_contention_t or the fact that __libcpp_thread_poll_with_backoff needs two functions, to higher-level constructs such as semaphore.
  • It doesn't prevent consumers from calling __cxx_atomic_wait in an error-prone way, i.e. by passing it a predicate that doesn't take an argument. This makes ABA-style issues more likely to appear.

Now, __cxx_atomic_wait_fn takes just one function, which is then transformed into the poll and backoff callables needed by __libcpp_thread_poll_with_backoff.

Aside from the __cxx_atomic_wait changes, the only other change is in semaphore's try_acquire: the initial atomic load is weakened to memory_order_relaxed, and the CAS inside the loop is changed from strong to weak. Both weakenings should be fine, since the CAS is called in a loop and the "acquire" semantics of try_acquire come from the CAS, not from the initial load.

@jiixyj jiixyj requested a review from a team as a code owner February 4, 2024 13:37
@llvmbot llvmbot added the libc++ label (C++ Standard Library. Not GNU libstdc++. Not libc++abi.) Feb 4, 2024
@llvmbot
Member

llvmbot commented Feb 4, 2024

@llvm/pr-subscribers-libcxx

Author: Jan Kokemüller (jiixyj)


Full diff: https://github.com/llvm/llvm-project/pull/80596.diff

5 Files Affected:

  • (modified) libcxx/include/__atomic/atomic_sync.h (+69-22)
  • (modified) libcxx/include/latch (+9-2)
  • (modified) libcxx/include/semaphore (+14-15)
  • (modified) libcxx/src/atomic.cpp (+1)
  • (added) libcxx/test/std/thread/thread.semaphore/lost_wakeup.pass.cpp (+64)
diff --git a/libcxx/include/__atomic/atomic_sync.h b/libcxx/include/__atomic/atomic_sync.h
index 93527958b2e1c..1eca545afb99f 100644
--- a/libcxx/include/__atomic/atomic_sync.h
+++ b/libcxx/include/__atomic/atomic_sync.h
@@ -43,17 +43,55 @@ __libcpp_atomic_monitor(__cxx_atomic_contention_t const volatile*);
 _LIBCPP_AVAILABILITY_SYNC _LIBCPP_EXPORTED_FROM_ABI void
 __libcpp_atomic_wait(__cxx_atomic_contention_t const volatile*, __cxx_contention_t);
 
-template <class _Atp, class _Fn>
+template <class _Atp, class _Poll>
+struct __libcpp_atomic_wait_poll_impl {
+  _Atp* __a_;
+  _Poll __poll_;
+  memory_order __order_;
+
+  _LIBCPP_AVAILABILITY_SYNC
+  _LIBCPP_HIDE_FROM_ABI bool operator()() const {
+    auto __current_val = __cxx_atomic_load(__a_, __order_);
+    return __poll_(__current_val);
+  }
+};
+
+template <class _Atp, class _Poll>
 struct __libcpp_atomic_wait_backoff_impl {
-  _Atp* __a;
-  _Fn __test_fn;
+  _Atp* __a_;
+  _Poll __poll_;
+  memory_order __order_;
+
+  _LIBCPP_AVAILABILITY_SYNC
+  _LIBCPP_HIDE_FROM_ABI bool
+  __poll_or_get_monitor(__cxx_atomic_contention_t const volatile*, __cxx_contention_t& __monitor) const {
+    // In case the atomic can be waited on directly, the monitor value is just
+    // the value of the atomic.
+    // `__poll_` takes the current value of the atomic as an in-out argument
+    // to potentially modify it. After it returns, `__monitor` has a value
+    // which can be safely waited on by `std::__libcpp_atomic_wait` without any
+    // ABA style issues.
+    __monitor = __cxx_atomic_load(__a_, __order_);
+    return __poll_(__monitor);
+  }
+
+  _LIBCPP_AVAILABILITY_SYNC
+  _LIBCPP_HIDE_FROM_ABI bool __poll_or_get_monitor(void const volatile*, __cxx_contention_t& __monitor) const {
+    // In case we must wait on an atomic from the pool, the monitor comes from
+    // `std::__libcpp_atomic_monitor`.
+    // Only then we may read from `__a_`. This is the "event count" pattern.
+    __monitor          = std::__libcpp_atomic_monitor(__a_);
+    auto __current_val = __cxx_atomic_load(__a_, __order_);
+    return __poll_(__current_val);
+  }
+
   _LIBCPP_AVAILABILITY_SYNC
   _LIBCPP_HIDE_FROM_ABI bool operator()(chrono::nanoseconds __elapsed) const {
     if (__elapsed > chrono::microseconds(64)) {
-      auto const __monitor = std::__libcpp_atomic_monitor(__a);
-      if (__test_fn())
+      __cxx_contention_t __monitor;
+      if (__poll_or_get_monitor(__a_, __monitor))
         return true;
-      std::__libcpp_atomic_wait(__a, __monitor);
+      std::__libcpp_atomic_wait(__a_, __monitor);
     } else if (__elapsed > chrono::microseconds(4))
       __libcpp_thread_yield();
     else {
@@ -62,10 +100,20 @@ struct __libcpp_atomic_wait_backoff_impl {
   }
 };
 
-template <class _Atp, class _Fn>
-_LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI bool __cxx_atomic_wait(_Atp* __a, _Fn&& __test_fn) {
-  __libcpp_atomic_wait_backoff_impl<_Atp, __decay_t<_Fn> > __backoff_fn = {__a, __test_fn};
-  return std::__libcpp_thread_poll_with_backoff(__test_fn, __backoff_fn);
+// The semantics of this function are similar to `atomic`'s
+// `.wait(T old, std::memory_order order)`, but instead of having a hardcoded
+// predicate (is the loaded value unequal to `old`?), the predicate function is
+// specified as an argument. The loaded value is given as an in-out argument to
+// the predicate. If the predicate function returns `true`,
+// `_cxx_atomic_wait_fn` will return. If the predicate function returns
+// `false`, it must set the argument to its current understanding of the atomic
+// value. The predicate function must not return `false` spuriously.
+template <class _Atp, class _Poll>
+_LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI void
+__cxx_atomic_wait_fn(_Atp* __a, _Poll&& __poll, memory_order __order) {
+  __libcpp_atomic_wait_poll_impl<_Atp, __decay_t<_Poll> > __poll_fn       = {__a, __poll, __order};
+  __libcpp_atomic_wait_backoff_impl<_Atp, __decay_t<_Poll> > __backoff_fn = {__a, __poll, __order};
+  (void)std::__libcpp_thread_poll_with_backoff(__poll_fn, __backoff_fn);
 }
 
 #else // _LIBCPP_HAS_NO_THREADS
@@ -74,9 +122,10 @@ template <class _Tp>
 _LIBCPP_HIDE_FROM_ABI void __cxx_atomic_notify_all(__cxx_atomic_impl<_Tp> const volatile*) {}
 template <class _Tp>
 _LIBCPP_HIDE_FROM_ABI void __cxx_atomic_notify_one(__cxx_atomic_impl<_Tp> const volatile*) {}
-template <class _Atp, class _Fn>
-_LIBCPP_HIDE_FROM_ABI bool __cxx_atomic_wait(_Atp*, _Fn&& __test_fn) {
-  return std::__libcpp_thread_poll_with_backoff(__test_fn, __spinning_backoff_policy());
+template <class _Atp, class _Poll>
+_LIBCPP_HIDE_FROM_ABI void __cxx_atomic_wait_fn(_Atp* __a, _Poll&& __poll, memory_order __order) {
+  __libcpp_atomic_wait_poll_impl<_Atp, __decay_t<_Poll> > __poll_fn = {__a, __poll, __order};
+  (void)std::__libcpp_thread_poll_with_backoff(__poll_fn, __spinning_backoff_policy());
 }
 
 #endif // _LIBCPP_HAS_NO_THREADS
@@ -86,21 +135,19 @@ _LIBCPP_HIDE_FROM_ABI bool __cxx_nonatomic_compare_equal(_Tp const& __lhs, _Tp c
   return std::memcmp(std::addressof(__lhs), std::addressof(__rhs), sizeof(_Tp)) == 0;
 }
 
-template <class _Atp, class _Tp>
-struct __cxx_atomic_wait_test_fn_impl {
-  _Atp* __a;
+template <class _Tp>
+struct __cxx_atomic_wait_poll_fn_impl {
   _Tp __val;
-  memory_order __order;
-  _LIBCPP_HIDE_FROM_ABI bool operator()() const {
-    return !std::__cxx_nonatomic_compare_equal(std::__cxx_atomic_load(__a, __order), __val);
+  _LIBCPP_HIDE_FROM_ABI bool operator()(_Tp& __current_val) const {
+    return !std::__cxx_nonatomic_compare_equal(__current_val, __val);
   }
 };
 
 template <class _Atp, class _Tp>
-_LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI bool
+_LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI void
 __cxx_atomic_wait(_Atp* __a, _Tp const __val, memory_order __order) {
-  __cxx_atomic_wait_test_fn_impl<_Atp, _Tp> __test_fn = {__a, __val, __order};
-  return std::__cxx_atomic_wait(__a, __test_fn);
+  __cxx_atomic_wait_poll_fn_impl<_Tp> __poll_fn = {__val};
+  std::__cxx_atomic_wait_fn(__a, __poll_fn, __order);
 }
 
 _LIBCPP_END_NAMESPACE_STD
diff --git a/libcxx/include/latch b/libcxx/include/latch
index ad7b35579913b..45c1b5c91e6cd 100644
--- a/libcxx/include/latch
+++ b/libcxx/include/latch
@@ -97,9 +97,13 @@ public:
     if (__old == __update)
       __a_.notify_all();
   }
-  inline _LIBCPP_HIDE_FROM_ABI bool try_wait() const noexcept { return 0 == __a_.load(memory_order_acquire); }
+  inline _LIBCPP_HIDE_FROM_ABI bool try_wait() const noexcept {
+    auto __value = __a_.load(memory_order_acquire);
+    return try_wait_impl(__value);
+  }
   inline _LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI void wait() const {
-    __cxx_atomic_wait(&__a_.__a_, [this]() -> bool { return try_wait(); });
+    __cxx_atomic_wait_fn(
+        &__a_.__a_, [this](ptrdiff_t& __value) -> bool { return try_wait_impl(__value); }, memory_order_acquire);
   }
   inline _LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI void arrive_and_wait(ptrdiff_t __update = 1) {
     _LIBCPP_ASSERT_ARGUMENT_WITHIN_DOMAIN(__update >= 0, "latch::arrive_and_wait called with a negative value");
@@ -108,6 +112,9 @@ public:
     count_down(__update);
     wait();
   }
+
+private:
+  inline _LIBCPP_HIDE_FROM_ABI bool try_wait_impl(ptrdiff_t& __value) const noexcept { return __value == 0; }
 };
 
 _LIBCPP_END_NAMESPACE_STD
diff --git a/libcxx/include/semaphore b/libcxx/include/semaphore
index 649705f45b049..27e5951e7d982 100644
--- a/libcxx/include/semaphore
+++ b/libcxx/include/semaphore
@@ -54,6 +54,7 @@ using binary_semaphore = counting_semaphore<1>;
 #include <__assert> // all public C++ headers provide the assertion handler
 #include <__atomic/atomic_base.h>
 #include <__atomic/atomic_sync.h>
+#include <__atomic/contention_t.h>
 #include <__atomic/memory_order.h>
 #include <__availability>
 #include <__chrono/time_point.h>
@@ -94,35 +95,33 @@ public:
     auto __old = __a_.fetch_add(__update, memory_order_release);
     _LIBCPP_ASSERT_ARGUMENT_WITHIN_DOMAIN(
         __update <= _LIBCPP_SEMAPHORE_MAX - __old, "update is greater than the expected value");
-
-    if (__old > 0) {
-      // Nothing to do
-    } else if (__update > 1)
+    if (__old == 0) {
       __a_.notify_all();
-    else
-      __a_.notify_one();
+    }
   }
   _LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI void acquire() {
-    auto const __test_fn = [this]() -> bool {
-      auto __old = __a_.load(memory_order_relaxed);
-      return (__old != 0) && __a_.compare_exchange_strong(__old, __old - 1, memory_order_acquire, memory_order_relaxed);
-    };
-    __cxx_atomic_wait(&__a_.__a_, __test_fn);
+    __cxx_atomic_wait_fn(
+        &__a_.__a_, [this](ptrdiff_t& __old) { return __try_acquire_impl(__old); }, memory_order_relaxed);
   }
   template <class _Rep, class _Period>
   _LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI bool
   try_acquire_for(chrono::duration<_Rep, _Period> const& __rel_time) {
     if (__rel_time == chrono::duration<_Rep, _Period>::zero())
       return try_acquire();
-    auto const __test_fn = [this]() { return try_acquire(); };
-    return std::__libcpp_thread_poll_with_backoff(__test_fn, __libcpp_timed_backoff_policy(), __rel_time);
+    auto const __poll_fn = [this]() { return try_acquire(); };
+    return std::__libcpp_thread_poll_with_backoff(__poll_fn, __libcpp_timed_backoff_policy(), __rel_time);
   }
   _LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI bool try_acquire() {
-    auto __old = __a_.load(memory_order_acquire);
+    auto __old = __a_.load(memory_order_relaxed);
+    return __try_acquire_impl(__old);
+  }
+
+private:
+  _LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI bool __try_acquire_impl(ptrdiff_t& __old) {
     while (true) {
       if (__old == 0)
         return false;
-      if (__a_.compare_exchange_strong(__old, __old - 1, memory_order_acquire, memory_order_relaxed))
+      if (__a_.compare_exchange_weak(__old, __old - 1, memory_order_acquire, memory_order_relaxed))
         return true;
     }
   }
diff --git a/libcxx/src/atomic.cpp b/libcxx/src/atomic.cpp
index 2f0389ae6974a..2b67685c8a0a1 100644
--- a/libcxx/src/atomic.cpp
+++ b/libcxx/src/atomic.cpp
@@ -178,6 +178,7 @@ _LIBCPP_EXPORTED_FROM_ABI void __cxx_atomic_notify_one(__cxx_atomic_contention_t
 _LIBCPP_EXPORTED_FROM_ABI void __cxx_atomic_notify_all(__cxx_atomic_contention_t const volatile* __location) {
   __libcpp_contention_notify(&__libcpp_contention_state(__location)->__contention_state, __location, false);
 }
+// This function is never used, but still exported for ABI compatibility.
 _LIBCPP_EXPORTED_FROM_ABI __cxx_contention_t
 __libcpp_atomic_monitor(__cxx_atomic_contention_t const volatile* __location) {
   return __libcpp_contention_monitor_for_wait(&__libcpp_contention_state(__location)->__contention_state, __location);
diff --git a/libcxx/test/std/thread/thread.semaphore/lost_wakeup.pass.cpp b/libcxx/test/std/thread/thread.semaphore/lost_wakeup.pass.cpp
new file mode 100644
index 0000000000000..dca3c01cf0103
--- /dev/null
+++ b/libcxx/test/std/thread/thread.semaphore/lost_wakeup.pass.cpp
@@ -0,0 +1,64 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+// UNSUPPORTED: no-threads
+// UNSUPPORTED: c++03, c++11, c++14, c++17
+
+// XFAIL: availability-synchronization_library-missing
+
+// This is a regression test for https://llvm.org/PR47013.
+
+// <semaphore>
+
+#include <barrier>
+#include <semaphore>
+#include <thread>
+#include <vector>
+
+#include "make_test_thread.h"
+
+static std::counting_semaphore<> s(0);
+static std::barrier<> b(8 + 1);
+
+void acquire() {
+  for (int i = 0; i < 10'000; ++i) {
+    s.acquire();
+    b.arrive_and_wait();
+  }
+}
+
+void release() {
+  for (int i = 0; i < 10'000; ++i) {
+    s.release(1);
+    s.release(1);
+    s.release(1);
+    s.release(1);
+
+    s.release(1);
+    s.release(1);
+    s.release(1);
+    s.release(1);
+
+    b.arrive_and_wait();
+  }
+}
+
+int main(int, char**) {
+  for (int run = 0; run < 20; ++run) {
+    std::vector<std::thread> threads;
+    for (int i = 0; i < 8; ++i)
+      threads.push_back(support::make_test_thread(acquire));
+
+    threads.push_back(support::make_test_thread(release));
+
+    for (auto& thread : threads)
+      thread.join();
+  }
+
+  return 0;
+}

@jiixyj jiixyj marked this pull request as draft February 4, 2024 13:49
@jiixyj
Contributor Author

jiixyj commented Feb 4, 2024

As this is based on #79265, which is not yet merged, I converted this into a draft PR.

Member

@ldionne ldionne left a comment

Thanks for the patch, and sorry it took so long to get to it; the last few weeks have been quite busy. I appreciate that you spent time figuring out how to refactor this code and make it less prone to bugs -- it's important since this code is really complex.

I generally like the approach. I have a few comments and questions but I think we could move forward with this, as it does make the code better and also opens the door to more refactoring like #81427.

I must say I kinda liked the approach suggested by @huixie90 in #79265 where he refactored latch and barrier to use atomic::wait directly, but I guess we can move forward with this patch and then do that other approach in case we end up getting consensus to do it.

I would like @huixie90 to stamp this patch before I give it a last LGTM.


_LIBCPP_AVAILABILITY_SYNC
_LIBCPP_HIDE_FROM_ABI bool
__poll_or_get_monitor(__cxx_atomic_contention_t const volatile*, __cxx_contention_t& __monitor) const {
Member

IIUC, this is roughly what @huixie90 is transforming into a CPO in #81427 .

Contributor Author

Yes, the two __poll_or_get_monitor functions here are basically the same thing as the two __update_monitor_val_and_poll functions from #81427 (here).

__a, {__poll}};
return std::__libcpp_thread_poll_with_backoff(__poll, __backoff_fn);
_LIBCPP_AVAILABILITY_SYNC _LIBCPP_HIDE_FROM_ABI void
__cxx_atomic_wait_fn(_Atp* __a, _Poll&& __poll, memory_order __order) {
Member

We often use _fn to denote things that are implementation details or function objects in the library. IMO, keeping this function named __cxx_atomic_wait would be better.

Contributor Author

@jiixyj jiixyj Feb 18, 2024

We often use _fn to denote things that are implementation details or function objects in the library.

Thanks, that is good to know!

IMO, keeping this function named __cxx_atomic_wait would be better.

I agree. It would conflict with the "normal" __cxx_atomic_wait though, as the _Poll type parameter is unconstrained. Instead of doing some SFINAE hacks I renamed it to __cxx_atomic_wait_unless in anticipation of #81427.

struct __cxx_atomic_wait_test_fn_impl {
_Atp* __a;
template <class _Tp>
struct __cxx_atomic_wait_poll_fn_impl {
Member

I would suggest renaming this to __atomic_compare_equal_to or something like that -- that's still not a great name but it is a bit more descriptive, and in the future we could probably write this as a lambda (which is what would really make this code easier to read).

Contributor Author

Yes, a lambda would definitely help. I renamed this struct to __atomic_compare_unequal_to. So the logic can be read as "wait unless/until the atomic variable compares unequal to val" (similar to the wording in the spec: https://eel.is/c++draft/atomics#ref.ops-23.2).

@@ -178,6 +178,7 @@ _LIBCPP_EXPORTED_FROM_ABI void __cxx_atomic_notify_one(__cxx_atomic_contention_t
_LIBCPP_EXPORTED_FROM_ABI void __cxx_atomic_notify_all(__cxx_atomic_contention_t const volatile* __location) {
__libcpp_contention_notify(&__libcpp_contention_state(__location)->__contention_state, __location, false);
}
// This function is never used, but still exported for ABI compatibility.
Member

Is this because this is handled by __poll_or_get_monitor now?

Contributor Author

Yes, exactly. This overload of __libcpp_atomic_monitor was used to get the "monitor" value when the semaphore's ptrdiff_t could be waited on directly. And this was the reason for the original ABA bug, where this could happen:

  1. __libcpp_atomic_monitor returns "1" (i.e. the semaphore has the value 1)
  2. another thread swoops in, takes the "1" down to "0"
  3. "our" poll function returns "false", because there is nothing left to try_acquire
  4. another thread releases the semaphore (increases the value from "0" to "1")
  5. atomic_wait now waits on the "1", resulting in a deadlock.

Contributor Author

And now, __poll_or_get_monitor makes sure that in this case, the "monitor" value is always "0" (because return __poll_(__monitor); sets __monitor to "0" if it returns false).

@huixie90
Contributor

LGTM with comments from Louis resolved.

I could continue the discussion on whether to refactor the code to use the public wait API after this lands.
Also, just FYI, I have further refactoring to make these functions reusable by atomic_ref (built on top of this patch):
#81427

Member

@ldionne ldionne left a comment

Since I am going OOO for a week and I don't want to block this on my approval, please go ahead with the merge once you feel like my comments/questions have been resolved/answered. Don't wait for me to stamp again.

Thanks for the patch, again.

@jiixyj jiixyj force-pushed the atomic-wait-fixes-2 branch from aced309 to 4ef3415 Compare February 18, 2024 11:26
@jiixyj jiixyj force-pushed the atomic-wait-fixes-2 branch from 4ef3415 to f151e17 Compare February 18, 2024 11:41
@jiixyj jiixyj force-pushed the atomic-wait-fixes-2 branch from 9a3f43e to 2daeb53 Compare February 18, 2024 11:55
@jiixyj
Contributor Author

jiixyj commented Feb 18, 2024

Thank you @huixie90 and @ldionne for the review! From my side this is good to go.

@huixie90 huixie90 merged commit 95ebf2b into llvm:main Feb 19, 2024
huixie90 added a commit that referenced this pull request Mar 2, 2024
#81427)

The goal of this patch is to make `atomic`'s wait functions reusable by
`atomic_ref`.
#76647

First, this patch is built on top of
#80596 , to reduce the future
merge conflicts.

This patch makes the following functions "APIs" to be used by
`atomic`, `atomic_flag`, `semaphore`, `latch`, and `atomic_ref`:

```
__atomic_wait
__atomic_wait_unless
__atomic_notify_one
__atomic_notify_all
```

These functions are made generic to support both `atomic` and
`atomic_ref`. There are two customisation points.

```
// How to load the value from the given type (with a memory order)
__atomic_load
```


```
// what is the contention address that the platform `wait` function is going to monitor
__atomic_contention_address
```


For `atomic_ref` (not implemented in this patch), the `load` and
`address` functions will be different, because
- it does not use the "atomic abstraction layer", so the `load` operation
will be some gcc builtin
- the contention address will be the user's actual type that the
`atomic_ref` points to
blueboxd pushed a commit to blueboxd/libcxx that referenced this pull request May 4, 2024
Labels
libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
4 participants