Skip to content

[BUG] GIL hang in multi-threaded situation #2888

@daltairwalter

Description

@daltairwalter

I am seeing a gil hang issue in multi-threaded situations when a worker thread initializes pybind11 (2.6.2) while operating inside of a PyGILState_Ensure/PyGILState_Release block. The PyGILState_Release is deleting the PyThreadState and the internal tstate ends up holding a dangling pointer.

This seems to be a partially known issue within pybind11 from the following comment I see in pybind11.h:
/* Check if the GIL was acquired using the PyGILState_* API instead (e.g. if
calling from a Python thread). Since we use a different key, this ensures
we don't create a new thread state and deadlock in PyEval_AcquireThread
below. Note we don't save this state with internals.tstate, since we don't
create it we would fail to clear it (its reference count should be > 0). */
tstate = PyGILState_GetThisThreadState();

This works for all threads that don’t initialize pybind11 because they don’t start off with a stored tstate.

I have tried changing internals.h to store nullptr instead of tstate and this fixes the dangling pointer problems that I am seeing. I imagine that historically saving this tstate during initialization was somehow thought of as a performance advantage, though today as gil_scoped_acquire always either creates a new tstate or calls detail::get_thread_state_unchecked(), it is unclear how any performance advantage is achieved. If performance is a significant concern here, it would also be advisable to reduce the redundancy of calling both PyGILState_GetThisThreadState and detail::get_thread_state_unchecked.

This dangling tstate problem can happen in both officially supported and not officially supported situations. As an officially supported example would be far more complicated, I am going to start with a non-officially supported example. If necessary, examples can be made using multiple pyd’s and using the python executable rather than embedding this – though I see no reason to make life more difficult for everyone this way.

`#include <Python.h>
#include
#include <pybind11/pybind11.h>
#include

namespace py = pybind11;

void causeHang()
{
auto state = PyGILState_Ensure();
{
py::gil_scoped_acquire thisHangsTheSecondTime;
}
PyGILState_Release(state);
}

void runIt()
{
for (size_t i = 0; i < 100; ++i)
{
std::cout << "trying " << i << std::endl;
causeHang();
}
}

int main()
{
Py_Initialize();
PyEval_InitThreads();
auto saved = PyEval_SaveThread();
std::thread(runIt).join();
PyEval_RestoreThread(saved);
Py_Finalize();
}
`

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions