Fix a race condition in pthread call targets not waking up. #12244

kripken · 2020-09-17T01:13:35Z

A thread may finish handling its events right before we add another one.
If it is then idle, it may never handle the new event. To avoid that, when
we wait on a call, notify the target thread so it wakes up.

To allow that, track the target thread of each proxied call, so we know who to
notify.
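
The wake-on-wait idea can be sketched with plain POSIX threads. This is a minimal illustration under stated assumptions, not the code in library_pthread.c: `mailbox`, `call`, `enqueue_call`, `wait_for_call`, and `run_demo` are hypothetical names, and a condition variable stands in for whatever wake mechanism the runtime actually uses.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct call;

/* Hypothetical per-thread mailbox; a stand-in for the real call queue. */
typedef struct mailbox {
  pthread_mutex_t lock;
  pthread_cond_t wake;      /* signalled to wake an idle target thread */
  pthread_cond_t completed; /* signalled when a call has been handled */
  struct call *pending;     /* single-slot "queue" to keep the sketch short */
  bool quit;
} mailbox;

/* Each proxied call records its target, so a waiter knows whom to notify. */
typedef struct call {
  mailbox *target;
  void (*func)(void *);
  void *arg;
  bool done;
} call;

/* Adds the call without notifying; an idle target will not notice it yet. */
static void enqueue_call(call *c) {
  pthread_mutex_lock(&c->target->lock);
  c->target->pending = c;
  pthread_mutex_unlock(&c->target->lock);
}

/* Waiting on a call first wakes the recorded target, then blocks until done. */
static void wait_for_call(call *c) {
  mailbox *t = c->target;
  pthread_mutex_lock(&t->lock);
  pthread_cond_signal(&t->wake);
  while (!c->done) pthread_cond_wait(&t->completed, &t->lock);
  pthread_mutex_unlock(&t->lock);
}

/* Target thread: handle a pending call if there is one, otherwise go idle. */
static void *target_main(void *p) {
  mailbox *m = p;
  pthread_mutex_lock(&m->lock);
  while (!m->quit) {
    if (m->pending) {
      call *c = m->pending;
      m->pending = NULL;
      c->func(c->arg);
      c->done = true;
      pthread_cond_broadcast(&m->completed);
    } else {
      pthread_cond_wait(&m->wake, &m->lock);
    }
  }
  pthread_mutex_unlock(&m->lock);
  return NULL;
}

static void add_one(void *arg) { ++*(int *)arg; }

/* Proxies add_one to a target thread and returns the updated value. */
static int run_demo(void) {
  mailbox m;
  pthread_mutex_init(&m.lock, NULL);
  pthread_cond_init(&m.wake, NULL);
  pthread_cond_init(&m.completed, NULL);
  m.pending = NULL;
  m.quit = false;

  int value = 41;
  call c = {&m, add_one, &value, false};
  pthread_t t;
  int rc = pthread_create(&t, NULL, target_main, &m);
  assert(rc == 0);

  enqueue_call(&c);
  wait_for_call(&c);

  /* Shut the target down cleanly. */
  pthread_mutex_lock(&m.lock);
  m.quit = true;
  pthread_cond_signal(&m.wake);
  pthread_mutex_unlock(&m.lock);
  pthread_join(t, NULL);

  pthread_cond_destroy(&m.completed);
  pthread_cond_destroy(&m.wake);
  pthread_mutex_destroy(&m.lock);
  return value;
}
```

Calling `run_demo()` from a `main` function exercises the path where the enqueue itself sends no notification and only the waiter's signal wakes the idle target.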

tlively · 2020-09-17T07:16:06Z

system/lib/pthread/library_pthread.c

+        // which in a race condition may have finished handling its event queue
+        // just after we added our event. (We could also notify it once right


How is this possible? It looks like all enqueue and dequeue operations are protected by the call_queue_lock. So the event must have been added either (a) after the target thread finished handling its event queue and released the lock, or (b) before the target thread finished handling its event queue. In case (b) the event would be handled, because the enqueue would synchronize with the dequeue.

kripken · 2020-09-17

I'm not entirely sure. But we get clear deadlocks without this patch, where the main thread needs to be woken up, which this patch fixes (see #12258).

It may be that there is something not entirely atomic about our mutexes, in which case I'm not sure what the best debugging approach is (maybe we need to debug the browser itself?).

tlively · 2020-09-17

> It may be that there is something not entirely atomic about our mutexes

Yikes! Perhaps we could demonstrate such an issue with our mutexes in a smaller, controlled experiment? A test of a dining philosophers solution could be a good simple stress test for deadlock. It would also be good to narrow down whether the lock misbehaves on the main thread or on non-main threads.
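
A stress test along these lines might look like the sketch below, assuming plain POSIX threads. The names, the philosopher count, and the round count are all illustrative. It acquires forks in a fixed global order, so a correct mutex implementation cannot deadlock; a hang or a miscounted fork would point at the lock itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define PHILOSOPHERS 5
#define ROUNDS 10000

static pthread_mutex_t forks[PHILOSOPHERS];
static long fork_uses[PHILOSOPHERS]; /* each entry is guarded by its fork */

static void *philosopher(void *arg) {
  size_t id = (size_t)(uintptr_t)arg;
  size_t left = id;
  size_t right = (id + 1) % PHILOSOPHERS;
  /* Always lock the lower-numbered fork first to rule out a circular wait. */
  size_t first = left < right ? left : right;
  size_t second = left < right ? right : left;
  for (int i = 0; i < ROUNDS; i++) {
    pthread_mutex_lock(&forks[first]);
    pthread_mutex_lock(&forks[second]);
    fork_uses[first]++;
    fork_uses[second]++;
    pthread_mutex_unlock(&forks[second]);
    pthread_mutex_unlock(&forks[first]);
  }
  return NULL;
}

/* Returns the number of forks whose use count is wrong (0 if all is well). */
static int run_philosophers(void) {
  pthread_t threads[PHILOSOPHERS];
  for (size_t i = 0; i < PHILOSOPHERS; i++) {
    pthread_mutex_init(&forks[i], NULL);
    fork_uses[i] = 0;
  }
  for (size_t i = 0; i < PHILOSOPHERS; i++) {
    int rc = pthread_create(&threads[i], NULL, philosopher,
                            (void *)(uintptr_t)i);
    assert(rc == 0);
  }
  for (size_t i = 0; i < PHILOSOPHERS; i++) {
    pthread_join(threads[i], NULL);
  }
  int bad = 0;
  for (size_t i = 0; i < PHILOSOPHERS; i++) {
    /* Every fork is shared by exactly two philosophers. */
    if (fork_uses[i] != 2L * ROUNDS) bad++;
    pthread_mutex_destroy(&forks[i]);
  }
  return bad;
}
```

Running the philosophers once from the main thread and once entirely from non-main threads would help separate the two cases mentioned above.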

kripken · 2020-09-17

#12258 has the smallest controlled experiment I can get so far. But it still depends on allocation, proxying, and mutexes...

kripken · 2020-09-22T23:26:19Z

I have found the actual cause here, and will open a refactoring PR and then a fix PR shortly.

kripken added 2 commits September 16, 2020 17:16

fix

c4bc21a

fix

62fbac8

kripken requested a review from juj September 17, 2020 01:13

kripken added a commit that referenced this pull request Sep 17, 2020

Add a testcase for #12243 #12244 #12245 [ci skip]

5617421

tlively reviewed Sep 17, 2020

This was referenced Sep 17, 2020

Add a testcase for pthreads race conditions #12258

Closed

Exception Handling - exception on the wrong thread when using pthreads #12035

Closed

kripken marked this pull request as draft September 22, 2020 19:42

kripken closed this Sep 22, 2020

kripken deleted the pthread2 branch September 22, 2020 23:26
