
bpo-39995: Fix concurrent.futures _ThreadWakeup #19760


Merged: 1 commit merged into python:master on Apr 29, 2020

Conversation

@vstinner (Member) commented Apr 28, 2020:

Fix a race condition in concurrent.futures._ThreadWakeup: access to
_ThreadWakeup is now protected with the shutdown lock.

https://bugs.python.org/issue39995

@@ -718,12 +733,12 @@ def shutdown(self, wait=True, *, cancel_futures=False):
        with self._shutdown_lock:
            self._cancel_pending_futures = cancel_futures
            self._shutdown_thread = True
            if self._executor_manager_thread_wakeup is not None:
@vstinner (Member Author) commented on the diff:

I replaced if self._executor_manager_thread: with if self._executor_manager_thread_wakeup is not None:. Or was the previous code written that way on purpose?

cc @pitrou @tomMoral

A Contributor replied on the diff:

I think this is equivalent, as both are set to None in the same call to shutdown (L.744/750).
It is more explicit this way, so perfect.

@vstinner (Member Author) commented:

This PR is based on an idea from @tomMoral (Thomas Moreau):
https://bugs.python.org/issue39995#msg367503

One solution would be to use the self._shutdown_lock from the executor to protect the call to close in terminate_broken and the call to self._thread_wakeup.wakeup in shutdown. That way, the lock is only acquired at critical points without being used all the time. This could also be done by adding lock=True/False to only lock the potentially dangerous calls.

I wrote a conservative change which always locks ProcessPoolExecutor._shutdown_lock while accessing _ThreadWakeup, because I don't know the concurrent.futures module well. Maybe we can avoid the lock in some cases.
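The conservative scheme described above can be sketched as follows. This is an illustrative mock, not the CPython implementation: the class and attribute names (ThreadWakeupSketch, ExecutorSketch, _wakeup) are hypothetical, but the pattern is the one under discussion, a single shutdown lock guarding every access to the pipe-based wakeup object so that wakeup() can never race with close().

```python
import os
import threading


class ThreadWakeupSketch:
    """Pipe-based wakeup object (illustrative names, not CPython's code)."""

    def __init__(self):
        self._closed = False
        self._reader_fd, self._writer_fd = os.pipe()

    def close(self):
        # Closing the pipe while another thread is writing to it is the
        # race this PR guards against; callers must hold the shutdown lock.
        if not self._closed:
            self._closed = True
            os.close(self._writer_fd)
            os.close(self._reader_fd)

    def wakeup(self):
        if not self._closed:
            os.write(self._writer_fd, b"x")


class ExecutorSketch:
    def __init__(self):
        self._shutdown_lock = threading.Lock()
        self._wakeup = ThreadWakeupSketch()

    def shutdown(self):
        # Conservative approach: take the lock around *every* access.
        with self._shutdown_lock:
            if self._wakeup is not None:
                self._wakeup.wakeup()

    def terminate_broken(self):
        with self._shutdown_lock:
            if self._wakeup is not None:
                self._wakeup.close()
                self._wakeup = None


executor = ExecutorSketch()
executor.shutdown()          # lock held while writing to the pipe
executor.terminate_broken()  # lock held while closing the pipe
executor.shutdown()          # safe no-op: the wakeup object is gone
```

Because both shutdown() and terminate_broken() serialize on the same lock, the "write after close" OSError seen in bpo-39995 cannot occur in this sketch, at the cost of taking the lock even on uncontended paths.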

@vstinner (Member Author) commented:

@pitrou @tomMoral: https://bugs.python.org/issue39995 is becoming more and more annoying for the Python CI. I suggest starting with a conservative approach to unblock the issue, and then investigating where we can avoid locking.

Right now, tons of test_concurrent_futures tests are failing randomly, and it's becoming hard to handle all these failures and identify their root causes.

@vstinner (Member Author) commented:

I tested manually: this PR fixes the test_killed_child() and ProcessPoolForkExecutorDeadlockTest tests.

With my https://bugs.python.org/msg367463 patch (sleep) and without this change, the bug is very easy to reproduce. With this change, the tests no longer fail.

In short, with the https://bugs.python.org/msg367463 patch (sleep) and this change, only test_cancel_futures() fails, but that's expected; see https://bugs.python.org/issue39995#msg367544

@vstinner vstinner merged commit a4dfe8e into python:master Apr 29, 2020
@vstinner vstinner deleted the futures_shutdown_lock branch April 29, 2020 01:32
@@ -380,7 +389,9 @@ def wait_result_broken_or_wakeup(self):
        elif wakeup_reader in ready:
            is_broken = False
            self.thread_wakeup.clear()

        with self.shutdown_lock:
@tomMoral (Contributor) commented on the diff, Apr 29, 2020:

I don't think this lock is necessary, as its goal is to protect against _ThreadWakeup.close(), which should only be called from the same thread.
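The distinction behind this review comment can be illustrated with a small sketch (hypothetical class and method names, not CPython's code): wakeup() is called from arbitrary threads and so must be serialized against close(), whereas clear() and close() both run in the executor-manager thread, so, under that assumption, they cannot race with each other and clear() would not need the lock.

```python
import os
import threading


class WakeupPipe:
    """Illustrative sketch of the wakeup pipe under discussion."""

    def __init__(self):
        self._reader_fd, self._writer_fd = os.pipe()
        # Non-blocking reads let clear() drain the pipe without hanging.
        os.set_blocking(self._reader_fd, False)
        self.lock = threading.Lock()

    def wakeup(self):
        # Called from other threads: needs the lock so a concurrent
        # close() cannot invalidate the writer fd mid-write.
        with self.lock:
            os.write(self._writer_fd, b"x")

    def clear(self):
        # Per the comment above: if clear() runs only in the same thread
        # that later calls close(), no lock is needed between the two.
        # Returns the number of bytes drained (for demonstration).
        drained = 0
        try:
            while True:
                chunk = os.read(self._reader_fd, 4096)
                if not chunk:
                    break
                drained += len(chunk)
        except BlockingIOError:
            pass  # pipe is empty
        return drained

    def close(self):
        os.close(self._writer_fd)
        os.close(self._reader_fd)


pipe = WakeupPipe()
pipe.wakeup()   # another thread signals the manager
pipe.clear()    # manager thread drains the pipe; no lock taken
pipe.close()    # same thread closes it; cannot race with clear()
```

The merged change took the conservative route and locked this path anyway; this sketch only shows why the lock could arguably be dropped for clear() if the same-thread invariant holds.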
