Skip to content

Conversation

@polytypic
Copy link
Contributor

I realized my previous attempt suffered from pretty much the same race condition as the earlier implementations. 😞

The logic of the new tail update algorithm is as follows:

The tail update is only responsible for updating the tail as long as the new
node added still has the true tail. At any point some newly added node has the
true tail, so this responsibility ensures that the tail gets updated. To check
whether we are still responsible, we compare: get new_tail == Nil. If that
is false, then someone else is responsible and we can stop. The current
old_tail must be read before that check. If the compare_and_set fails, we
must retry. Otherwise we can stop as we've done our part.

See also related kcas PR.

I realized my previous attempt suffered from pretty much the same race condition
as the earlier implementations.

The logic of the new tail update algorithm is as follows:

> The tail update is only responsible for updating the tail as long as the new
> node added still has the true tail. At any point some newly added node has the
> true tail, so this responsibility ensures that the tail gets updated. To check
> whether we are still responsible, we compare: `get new_tail == Nil`. If that
> is false, then someone else is responsible and we can stop. The current
> old_tail must be read before that check. If the `compare_and_set` fails, we
> must retry. Otherwise we can stop as we've done our part.
@polytypic polytypic requested a review from lyrm March 21, 2023 10:54
match Atomic.get new_tail with
| Nil -> ()
| Next (_, new_new_tail) -> fix_tail tail new_tail new_new_tail
if not (Atomic.compare_and_set tail old_tail new_tail) then
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit confused about this CAS. It seems to me that :

  • it works with or without it
  • if a lot of concurrent push are done, without is better as most of the push operations may call fix_tail anyway
  • otherwise, it is better with it as it avoids 2 extra Atomic.get call.

Am I getting it right ?

Copy link
Contributor Author

@polytypic polytypic Mar 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds roughly about right.

When thinking about performance, I would recommend trying to think about these in terms of the cache coherency protocols rather than atomic operations. See e.g. the Simplified MSI model of shared memory that I wrote. What is costly is the cache line transfers. The atomic operations, as such, are cheap. So, for example, a compare_and_set to a cache line that is already in the L1 cache of the core that is performing the CAS is quite cheap. Any access, get or compare_and_set, to a cache line that is not in the L1 cache is expensive.

But about the code. Basically, at that point we have the same information as at the point in fix_tail that performs the compare_and_set. In other words, we've read the tail and we knew the node we inserted had the true tail. So we have roughly just inlined the first iteration of fix_tail. This also helps in that we avoid calling a recursive function.

There is a small possibility that

  1. some other domain has updated the tail (i.e. a previous push (or next push)) and even smaller possibility that
  2. some other domain has managed to push another node into the queue (i.e. next push) and yet smaller possibility that
  3. both of the above have happened.

In the first scenario (a previous push) we still need to get exclusive ownership of the tail cache line (we are still responsible to update the tail), so optimistically attempting the compare_and_set turns out to be cheap. IOW, calling fix_tail would not be better.

In the second (and third) scenario, we could avoid getting exclusive ownerhip of the tail and potentially save a few expensive cache line transfers. However, the propability of this is small as there is a very small window of time between the compare_and_set in find_tail_and_enq and the optimistic compare_and_set to update the tail.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this explanation !

@lyrm lyrm merged commit 2713111 into main May 9, 2023
@polytypic polytypic deleted the fix-ms-queue-tail-update branch November 19, 2023 10:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants