Description
I commented on the relevant commit (#16), but too late, as the commit had already been merged, so I am opening an issue so that this can be tracked.
Admittedly, I don't know exactly what role the fence plays here. It does not exist in Dmitry Vyukov's original implementation of the queue, so I guess it was added as part of the modifications that make this queue linearisable (unlike the original queue).
That being said, if the cross-platform solution is indeed to place the load before the fence (this, I do not know), then I am pretty sure that the Intel specialization, which uses a lock operation instead of an mfence, should also keep the load before the lock operation.
I did look at https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html but could not see where it states that lock + mov (in this order) is equivalent to mov + mfence. In fact, the latest GCC does use the lock optimization and definitely preserves the order, i.e. mov + lock (see this godbolt: https://godbolt.org/z/o3rYdTvYv).
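To make the two orderings concrete, here is a minimal sketch in portable C++. It is not the project's code: `shared_index` and the function names are hypothetical stand-ins for whatever the queue actually loads around the fence. Compiled with a recent GCC for x86-64, I would expect the first variant to come out as a `mov` followed by the fence instruction (mfence or a lock-prefixed operation), and the second as the reverse, but that is worth checking on godbolt rather than taking my word for it.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the value the queue loads next to the fence.
std::atomic<std::uint64_t> shared_index{0};

// Variant A: load first, then a sequentially consistent fence
// (the "mov + fence" order discussed above).
std::uint64_t load_then_fence() {
    std::uint64_t v = shared_index.load(std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_seq_cst);
    return v;
}

// Variant B: fence first, then load (the "fence + mov" order).
std::uint64_t fence_then_load() {
    std::atomic_thread_fence(std::memory_order_seq_cst);
    return shared_index.load(std::memory_order_relaxed);
}

// Single-threaded sanity check: with no concurrent writer both variants
// return the same value. The two orders can only differ when another
// thread writes concurrently, which is the case this issue is about.
void check_single_threaded() {
    shared_index.store(7, std::memory_order_relaxed);
    assert(load_then_fence() == fence_then_load());
}
```

The point of the sketch is only that the position of the load relative to the fence is a property of the source, so a platform-specific replacement for the fence should sit at the same place relative to the load as the fence it replaces.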