Skip to content

Fix a race condition in pthreads call proxying waiting #12243

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 10 additions & 1 deletion system/lib/pthread/library_pthread.c
Original file line number Diff line number Diff line change
Expand Up @@ -418,7 +418,16 @@ EMSCRIPTEN_RESULT emscripten_wait_for_call_v(em_queued_call* call, double timeou
double waitEndTime = now + timeoutMSecs;
emscripten_set_current_thread_status(EM_THREAD_STATUS_WAITPROXY);
while (!done && now < waitEndTime) {
r = emscripten_futex_wait(&call->operationDone, 0, waitEndTime - now);
// Wait at most one second in each loop iteration. That keeps CPU load low
// while also avoids a possible race condition, as call->operationDone
// can be set *after* we checked it but *before* we do the futex_wait. The
// futex_wait is not aware of call->operationDone which means if we wait
Comment on lines +422 to +424
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused. Isn't the whole point of futex_wait to do an atomic compare-and-block operation? Why wouldn't it be aware of call->operationDone?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I think you're right, while this fixes the issue the diagnosis does look wrong.

// forever there we'd never return to the loop and get another chance to
// check.
// FIXME: this does mean we can end up with an almost 1 second wait that
// is unnecessary in some cases. Exponential backoff may make more sense.
double currWait = fmin(waitEndTime - now, 1000);
r = emscripten_futex_wait(&call->operationDone, 0, currWait);
done = emscripten_atomic_load_u32(&call->operationDone);
now = emscripten_get_now();
}
Expand Down