[BUG] GIL hang in multi-threaded situation #2888
Comments
@daltairwalter This situation should be improved in the latest pybind11 release.
I will test again with 2.8.0, but I didn't see any changes that I would expect to make a difference. It still looks like the tstate that is stored in the TLS during the initial creation of internals is a dangling pointer. If there is a specific change that addresses this, I would be happy to look at it.
I was referring to this PR: #3237 and its subsequent PRs, although it sounds like it may not solve this issue.
We are embedding Python and we have a few threads that we re-use for code evaluations. We were wrapping each evaluation in pybind11/include/pybind11/gil.h Line 82 in d8565ac
I believe this matches what @daltairwalter described. During the first The workaround we ended up with is to create a thread state for each of our threads using
I have the exact same crash with several pybind11 packages when using CPython as an embedded library as @jonatanklosko explained. Debugging this, pybind11 is calling
When I originally created this ticket, it was possible to make this happen in non-embedded scenarios. I think it required a well-behaved non-pybind11 pyd that did specific, legitimate things in order to make this fail, though.
As stated in the initial message, there is a relatively straightforward fix for this in pybind11: in internals.h, remove the line that stores the initial tstate, since that is the pointer that ends up dangling. The performance cost is that pybind11 would perform two TLS lookups during GIL acquisition instead of one in the cases where the main thread is being used. If TLS lookups are really expensive and the main thread really is the primary use case, it would also be possible to reorder the TLS lookups in the GIL acquire path. It would be nice to get a fix for this into pybind11, but the process seems fairly difficult, judging from what I have seen with the smart holder branch and other important features. I am not sure where the process would even begin for this.
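To make the lookup-cost argument concrete, here is a small toy model of the two lookup orders. It is only a sketch: `ToySlots`, `LookupResult`, `find_pybind_first`, and `find_gilstate_first` are hypothetical names, the integers stand in for `PyThreadState*` values, and the model assumes that a thread state is visible through exactly one of the two TLS keys. It does not call CPython or pybind11.

```cpp
// Toy model: hypothetical stand-ins, not CPython or pybind11 APIs.
struct ToySlots {
    int pybind_slot = 0;    // models pybind11's own TLS key; 0 models nullptr
    int gilstate_slot = 0;  // models the TLS key used by PyGILState_*
};

struct LookupResult {
    int handle;   // thread state found, or 0 if a new one would have to be created
    int lookups;  // number of TLS reads performed
};

// Order described in the issue: pybind11's key first, then the PyGILState_* key.
LookupResult find_pybind_first(const ToySlots& s) {
    int lookups = 1;
    int handle = s.pybind_slot;
    if (handle == 0) {
        ++lookups;
        handle = s.gilstate_slot;
    }
    return {handle, lookups};
}

// Reordered variant: the PyGILState_* key first, then pybind11's key.
LookupResult find_gilstate_first(const ToySlots& s) {
    int lookups = 1;
    int handle = s.gilstate_slot;
    if (handle == 0) {
        ++lookups;
        handle = s.pybind_slot;
    }
    return {handle, lookups};
}
```

In this model, a main thread whose state is no longer cached under pybind11's key costs two reads with the first order and one read with the second, while a thread whose state lives only under pybind11's key shows the opposite. Which order is cheaper therefore depends on which kind of thread dominates.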
I am seeing a GIL hang in multi-threaded situations when a worker thread initializes pybind11 (2.6.2) while operating inside a PyGILState_Ensure/PyGILState_Release block. PyGILState_Release deletes the PyThreadState, and the tstate stored in internals ends up holding a dangling pointer.
This seems to be a partially known issue within pybind11 from the following comment I see in pybind11.h:
/* Check if the GIL was acquired using the PyGILState_* API instead (e.g. if
calling from a Python thread). Since we use a different key, this ensures
we don't create a new thread state and deadlock in PyEval_AcquireThread
below. Note we don't save this state with internals.tstate, since we don't
create it we would fail to clear it (its reference count should be > 0). */
tstate = PyGILState_GetThisThreadState();
This works for all threads that don’t initialize pybind11 because they don’t start off with a stored tstate.
I have tried changing internals.h to store nullptr instead of tstate and this fixes the dangling pointer problems that I am seeing. I imagine that historically saving this tstate during initialization was somehow thought of as a performance advantage, though today as gil_scoped_acquire always either creates a new tstate or calls detail::get_thread_state_unchecked(), it is unclear how any performance advantage is achieved. If performance is a significant concern here, it would also be advisable to reduce the redundancy of calling both PyGILState_GetThisThreadState and detail::get_thread_state_unchecked.
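The sequence described above can be sketched with a toy model that uses no Python headers. Everything in it is a hypothetical stand-in: `Handle` models a `PyThreadState*`, `gilstate_slot` models the TLS key used by `PyGILState_*`, `pybind_slot` models the TLS key behind `internals.tstate`, and the `cache_on_init` flag switches between storing the current tstate at internals creation and storing nullptr. It illustrates the reasoning only and is not pybind11's implementation.

```cpp
#include <cassert>
#include <set>

// Toy model: every name here is a hypothetical stand-in, not a CPython or pybind11 API.
using Handle = int;  // models a PyThreadState*; 0 models nullptr

struct ToyThread {
    std::set<Handle> live;           // thread states that currently exist
    Handle gilstate_slot = 0;        // models the TLS key used by PyGILState_*
    Handle pybind_slot = 0;          // models the TLS key behind internals.tstate
    bool internals_created = false;  // has pybind11 been initialized yet?
    Handle next = 1;                 // next fresh handle value
};

// Models PyGILState_Ensure on a thread that has no thread state yet.
Handle toy_ensure(ToyThread& t) {
    Handle h = t.next++;
    t.live.insert(h);
    t.gilstate_slot = h;
    return h;
}

// Models the outermost PyGILState_Release, which deletes the state it created.
void toy_release(ToyThread& t, Handle h) {
    assert(t.live.count(h) == 1);
    t.live.erase(h);
    t.gilstate_slot = 0;
}

// Models gil_scoped_acquire; returns true if the state it picks still exists.
bool toy_scoped_acquire(ToyThread& t, bool cache_on_init) {
    if (!t.internals_created) {
        t.internals_created = true;
        // Internals creation stores the current tstate, or nullptr with the change.
        t.pybind_slot = cache_on_init ? t.gilstate_slot : 0;
    }
    Handle h = t.pybind_slot;          // first lookup: pybind11's own key
    if (h == 0) h = t.gilstate_slot;   // second lookup: the PyGILState_* key
    return t.live.count(h) != 0;
}

// Repeats Ensure / gil_scoped_acquire / Release and returns the first iteration
// that would use a deleted state, or -1 if no iteration does.
int first_stale_iteration(bool cache_on_init, int iterations) {
    ToyThread t;
    for (int i = 0; i < iterations; ++i) {
        Handle h = toy_ensure(t);
        bool ok = toy_scoped_acquire(t, cache_on_init);
        toy_release(t, h);
        if (!ok) return i;
    }
    return -1;
}
```

With `cache_on_init` set to true, the model picks a deleted state on the second iteration (index 1), because the slot still holds the handle from the first iteration. With it set to false, every iteration falls through to the state that the enclosing Ensure call just created.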
This dangling tstate problem can happen in both officially supported and not officially supported situations. As an officially supported example would be far more complicated, I am going to start with an example that is not officially supported. If necessary, examples can be made using multiple pyd files and the python executable rather than embedding, though I see no reason to make life more difficult for everyone this way.
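```cpp
#include <Python.h>
#include <iostream>
#include <pybind11/pybind11.h>
#include <thread>

namespace py = pybind11;

void causeHang()
{
    // Creates a PyThreadState for this worker thread if it has none.
    auto state = PyGILState_Ensure();
    {
        // The first use on this thread also initializes pybind11's internals.
        py::gil_scoped_acquire thisHangsTheSecondTime;
    }
    // Deletes the PyThreadState created by PyGILState_Ensure above.
    PyGILState_Release(state);
}

void runIt()
{
    for (size_t i = 0; i < 100; ++i)
    {
        std::cout << "trying " << i << std::endl;
        causeHang();
    }
}

int main()
{
    Py_Initialize();
    PyEval_InitThreads();  // no-op since Python 3.7, deprecated in 3.9
    // Release the GIL so the worker thread can take it.
    auto saved = PyEval_SaveThread();
    std::thread(runIt).join();
    PyEval_RestoreThread(saved);
    Py_Finalize();
}
```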