Deadlock on using ProcessPoolExecutor in fork mp context
#105464
Comments
@AlexWaygood I see you added this issue to a board. Could you please check whether this is a real bug that was not fixed by any of the multiprocessing/threading/concurrent patches in 3.9?
Hi, I was just doing some drive-by triaging. I'm a core dev, but I don't primarily work on multiprocessing issues in CPython. I could investigate further, but I'm not a multiprocessing expert, so I'm probably not best placed to do so. I also have a number of other projects on the go at the moment. I added it to the board so that members of the team who are experts on multiprocessing will be able to find it and work on it more easily. However, please try to see if the problem reproduces on Python 3.11 or later. Python 3.9 and 3.10 are now in "security-only" mode: we won't fix bugs in those versions of Python unless they relate to a security vulnerability.
@AlexWaygood I could reproduce it on Python 3.11.2, so maybe it's worth labeling the issue for Python 3.9, 3.10, and 3.11.
Unless you can persuasively argue it's a security issue, we won't be backporting a fix to 3.10 or older, so I won't add the 3.10 or 3.9 label for now.
A new thing I noticed: I could reproduce this deadlock even on Python 3.8. Just highlighting this for easier debugging.
Hi @kekekekule, I am just wondering if the bug is still present when running ProcessPoolExecutor with an mp_context as
With spawn/forkserver: no.
Sorry, I don't understand whether you ran your script with a context other than 'fork' just to check if something is still wrong.
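For anyone who wants to repeat that check, here is a minimal sketch of selecting the start method through `mp_context`. The `square` task and the `run_with_context` helper are hypothetical stand-ins, not the script from the report:

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor


def square(x):
    # Hypothetical stand-in task; not the workload from the report.
    return x * x


def run_with_context(method):
    # mp_context selects the start method used for the pool's workers.
    ctx = multiprocessing.get_context(method)
    with ProcessPoolExecutor(max_workers=2, mp_context=ctx) as pool:
        return list(pool.map(square, range(5)))
```

Calling `run_with_context("spawn")` or `run_with_context("forkserver")` from a script should be done under an `if __name__ == "__main__":` guard, since those start methods re-import the main module in each worker.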
Hi!
I ran into an issue using Python 3.9.16 and 3.9.17.
Launching this code (inspired by #90622):
It sometimes causes a deadlock.
The faulthandler dumps the traceback below on timeout:
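As a side note, a timeout dump like this can be armed roughly as follows. This is only a sketch assuming the standard `faulthandler.dump_traceback_later` call; the `run_with_watchdog` helper is hypothetical and the exact setup used in the report is not shown:

```python
import faulthandler
import sys


def run_with_watchdog(func, timeout, *args):
    # Arm a watchdog: if func has not returned after `timeout` seconds,
    # the tracebacks of all threads are written to the original stderr.
    faulthandler.dump_traceback_later(timeout, exit=False, file=sys.__stderr__)
    try:
        return func(*args)
    finally:
        # Disarm the watchdog once the call has completed.
        faulthandler.cancel_dump_traceback_later()
```

With `exit=False` the process keeps running after the dump, so a hung script stays available for inspection with tools such as strace.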
The pstree shows the following:
Running under Docker launched in QEMU on macOS (M1):
I tried to analyze the problem under strace. Here is the last output; after that it simply hangs:
With child processes with PIDs 304 and 305, strace showed on interrupting my script that it hung in wait4 on PID 305.
Process 304 exits normally after SIGINT; 305 does not and keeps hanging.
I did not check versions above 3.9.x, but I do not rule out that the problem is reproducible on later versions of Python.
UPD. Reproduced the same on Python 3.10.10:
UPD2. I found that when the issue happens, one of the children does not appear in the strace output at all, except in the wait4 call.
UPD3. Some logging from multiprocessing:
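How this logging was collected is not stated. One way to get comparable output, sketched here on the assumption that the standard `multiprocessing.log_to_stderr` helper is enough (the `enable_mp_debug_logging` name is hypothetical):

```python
import logging
import multiprocessing


def enable_mp_debug_logging():
    # log_to_stderr attaches a stderr handler to multiprocessing's logger
    # and returns that logger.
    logger = multiprocessing.log_to_stderr()
    logger.setLevel(logging.DEBUG)
    return logger
```

Calling this before the executor is created makes the worker start-up and shutdown messages visible on stderr.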