shutdown (exit) can hang or segfault with daemon threads running #46164
Comments
This probably applies to 3.0 as well but i have not tested it there. Here are some sample failures:
=========== A ==============
Program received signal SIGSEGV, Segmentation fault.
=========== B ==============
Exception in thread Thread-0000000001 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
  File "/home/gps/oss/python/trunk/Lib/threading.py", line 486, in __bootstrap_inner
  File "thread_exit.py", line 24, in run
<type 'exceptions.AttributeError'>: 'NoneType' object has no attribute 'add'
Program received signal SIGINT, Interrupt.
And as with all problems of this sort... sometimes the program exits
I ran python trunk:60012 under gdb above. But these problems occur on |
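The thread_exit.py script itself is not reproduced in this thread. As a rough illustration only, the sketch below is a hypothetical script of the same general shape: daemon threads that keep touching a module-level object while the main thread exits. The names `CHILD`, `worker`, `seen` and `run_once` are assumptions made for this example, and a fixed interpreter is expected to simply exit.

```python
import subprocess
import sys
import textwrap

# Hypothetical child script (NOT the original thread_exit.py): daemon
# threads keep using a module-level set while the main thread exits.
CHILD = textwrap.dedent("""
    import threading
    import time

    seen = set()  # module global that the workers keep touching

    def worker():
        n = 0
        while True:
            seen.add(n % 1000)
            n += 1

    for _ in range(4):
        t = threading.Thread(target=worker)
        t.daemon = True
        t.start()

    time.sleep(0.1)
    # The main thread ends here; the interpreter finalizes while the
    # daemon threads are still running.
""")


def run_once(timeout=30):
    """Run the child in a fresh interpreter and return its exit status."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode
```

Running the child in a separate process keeps a possible hang or crash out of the calling interpreter; `subprocess.run` raises `TimeoutExpired` if the child does not exit within `timeout` seconds.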
So can we definitely rule out that this could be caused by the recent |
Yes i believe it is unrelated to any recent change. I can reproduce both behaviors on my OS X 10.4 dual core mac using Python 2.3 on the mac appears to get stuck in a loop when run stand python 2.4.4 seems to hang most of the time. (all behaviors are possible i expect, i just ran it by hand under The systems i ran it on when reporting the bug was SMP. As with many |
I uploaded a script for a similar issue: I explain it this way:
On interpreter shutdown, the main thread clears the other threads'
ThreadState. There you find the instruction:
Py_CLEAR(tstate->frame);
But this can call arbitrary Python code on deallocation of locals, and
release the GIL (file.close() in our case).
The other thread can then continue to run. If it is given enough
processor time before the whole process shuts down, it will reach this
line in ceval.c (line 2633):
if (tstate->frame->f_exc_type != NULL)
Since tstate->frame has been cleared by the main thread ==> boom.
There can be several ways to solve this problem:
|
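The first step of the explanation above, that deallocating a frame's locals can run arbitrary Python code, can be seen from pure Python. This is a minimal sketch that assumes CPython's reference counting; `NoisyLocal`, `run` and `events` are hypothetical names used only for illustration. In the issue, the equivalent of `__del__` here is file.close(), which releases the GIL and lets the other thread run.

```python
import threading

events = []


class NoisyLocal:
    """Stand-in for an object whose deallocation runs Python code."""

    def __del__(self):
        # Runs when the last reference goes away, i.e. when the frame
        # holding the local variable is torn down.
        events.append(("finalizer ran in", threading.current_thread().name))


def run():
    local = NoisyLocal()  # lives only in this frame's locals
    events.append(("frame still alive", threading.current_thread().name))


run()  # under CPython refcounting, __del__ fires as run() returns
```

The finalizer runs in whichever thread drops the last reference. During shutdown that is the thread clearing the frame, not the thread that owned it.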
I think non-main threads should kill themselves off if they grab the PyGILState_Ensure would still be broken. It touches various things that |
agreed, during shutdown the other threads should be stopped. anything brainstorm: I haven't looked at the existing BEGIN_ALLOW_THREADS and |
I'm not sure I understand you, Gregory. Are you arguing in favour of adding I'm attaching a patch that has non-main thread exit, and it seems to fix Also note that PyThread_exit_thread() was completely broken, becoming a |
Adam, did you notice the change on revision 59231 ? the |
Adam, your patch covers one case of the thread releasing the GIL I have a competing patch: it makes the main thread never release the GIL Both approaches correct the initial problem, though. |
That doesn't matter. PyGILState_Ensure needs to remain valid *forever*. Note that PyGILState_Ensure has two behaviours: it can be called when ... You're right, I did forget the 3 other places that acquire the I think the banning should be as early as possible, right after |
We could apply the same idea: when exiting, PyGILState_Ensure() blocks Note that all this state must be restartable: after Py_Finalize(), it |
PyGILState_Ensure WOULD block forever if it acquired the GIL before The only way to make Py_Initialize callable after Py_Finalize is to make Note that unloading python.so between Py_Finalize and Py_Initialize |
|
Hrm. It seems you're right. Python needs thread-local data to |
Cleaned up version of Amaury's patch. I stop releasing the GIL after I also grab the import lock (and never release it). This should prevent Importing raises a potential issue with this approach. The standard |
Closed bpo-2077 as a duplicate. |
The threads don't have to be daemons: sys.exit() calls Py_Finalize() |
I think applying Rhamphoryncus' patch in bpo-1722344 fixes this too (that |
Is this still an issue in 2.7 or 3.x? |
It is certainly still a problem with 3.x, but I can't find a way to reproduce it here. |
See bpo-9901 for a variation on the same global issue (running threads can access interpreter structures - the GIL - while the main thread is trying to destroy them). |
With Python 3.3, thread_exit.py still crashes more or less randomly:
$ ./python thread_exit.py
[49205 refs]
python: Python/_warnings.c:501: setup_context: Assertion `((((((PyObject*)(*filename))->ob_type))->tp_flags & ((1L<<28))) != 0)' failed.
Abandon |
A patch that seems to work under Linux and Windows (for 3.2/3.3). |
Victor pointed out that Py_Finalize() is not necessarily called in the main Python thread. This new patch records the thread state of the finalizing thread, and also includes a test case. |
New changeset 2a19d09b08f8 by Antoine Pitrou in branch '3.2':
New changeset c892b0321d23 by Antoine Pitrou in branch 'default': |
Should be fixed in 3.2 and 3.3 now. I don't really want to bother with 2.7 and 3.1 (the GIL implementation is different), but someone can backport the patch if they want to :) |
There are some examples to work around this for Python 2: http://stackoverflow.com/questions/18098475/detect-interpreter-shut-down-in-daemon-thread |
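One general way to sidestep the problem, not taken from the linked answers, is to avoid leaving daemon threads running at exit at all: ask them to stop and join them from an atexit handler. The sketch below assumes that atexit handlers run before the interpreter starts tearing down thread state; `start_worker`, `stop_worker` and `ticks` are hypothetical names.

```python
import atexit
import threading

ticks = []  # work done by the worker, kept only for illustration


def start_worker():
    """Start a daemon worker that polls a stop flag between work items."""
    stop = threading.Event()

    def worker():
        while not stop.is_set():
            ticks.append(1)   # placeholder for real work
            stop.wait(0.01)   # sleep, but wake up early once stop is set

    thread = threading.Thread(target=worker)
    thread.daemon = True
    thread.start()
    return thread, stop


def stop_worker(thread, stop, timeout=5.0):
    """Ask the worker to finish, then wait for it to exit."""
    stop.set()
    thread.join(timeout)


worker_thread, stop_flag = start_worker()
# Stop the worker before interpreter teardown, so that no thread is
# still running Python code while its state is being destroyed.
atexit.register(stop_worker, worker_thread, stop_flag)
```

This only helps when the worker checks its stop flag regularly; a thread blocked indefinitely in a system call would still be alive when the join times out.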
New changeset 7741d0dd66ca by Benjamin Peterson in branch '2.7': |
http://tracker.ceph.com/issues/8797 reports that the backport to 2.7 causes a regression in ceph. |
I've opened bpo-21963 to track the 2.7.8 regression. Please continue any discussion there. |
New changeset 4ceca79d1c63 by Antoine Pitrou in branch '2.7': |
bpo-1193099 was marked as a duplicate of this issue. |