Occasional errors with free-threading #5674

Closed
henryiii opened this issue May 20, 2025 · 2 comments

Comments

@henryiii
Collaborator

henryiii commented May 20, 2025

So far these are always on 3.14t.

ubuntu 3.14t:
ubuntu 3.13t:

The following:

test_run_in_process_multiple_threads_parallel[test_cross_module_gil_nested_pybind11_acquired]
test_run_in_process_multiple_threads_parallel[test_cross_module_gil_inner_pybind11_released]
test_run_in_process_multiple_threads_parallel[test_cross_module_gil_nested_pybind11_released]

=================================== FAILURES ===================================
_ test_run_in_process_multiple_threads_parallel[test_cross_module_gil_nested_pybind11_released] _

test_fn = <function test_cross_module_gil_nested_pybind11_released at 0x2ebaf8a7500>

    @pytest.mark.skipif(sys.platform.startswith("emscripten"), reason="Requires threads")
    @pytest.mark.parametrize("test_fn", ALL_BASIC_TESTS_PLUS_INTENTIONAL_DEADLOCK)
    @pytest.mark.skipif(
        "env.GRAALPY",
        reason="GraalPy transiently complains about unfinished threads at process exit",
    )
    def test_run_in_process_multiple_threads_parallel(test_fn):
        """Makes sure there is no GIL deadlock when running in a thread multiple times in parallel.
    
        It runs in a separate process to be able to stop and assert if it deadlocks.
        """
>       assert _run_in_process(_run_in_threads, test_fn, num_threads=8, parallel=True) == 0
E       assert -11 == 0
E        +  where -11 = _run_in_process(_run_in_threads, <function test_cross_module_gil_nested_pybind11_released at 0x2ebaf8a7500>, num_threads=8, parallel=True)

test_fn    = <function test_cross_module_gil_nested_pybind11_released at 0x2ebaf8a7500>

../../tests/test_gil_scoped.py:241: AssertionError
=============================== warnings summary ===============================
<frozen importlib._bootstrap>:491
  <frozen importlib._bootstrap>:491: RuntimeWarning: The global interpreter lock (GIL) has been enabled to load module 'exo_planet_c_api', which has not declared that it can run safely without the GIL. To override this behavior and keep the GIL disabled (at your own risk), run with PYTHON_GIL=0 or -Xgil=0.
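For readers decoding the `-11` above: `multiprocessing.Process.exitcode` is negative when the child was terminated by a signal, so `-11` means signal 11, which is `SIGSEGV` on Linux and macOS. A minimal sketch of that decoding follows; `describe_exitcode` is a hypothetical helper name for illustration and is not part of the pybind11 test suite.

```python
import signal


def describe_exitcode(exitcode):
    """Return a readable description of a multiprocessing.Process.exitcode value."""
    if exitcode is None:
        # The process has not terminated (for example, it is still deadlocked).
        return "still running"
    if exitcode < 0:
        # A negative exitcode -N means the child was terminated by signal N.
        try:
            name = signal.Signals(-exitcode).name
        except ValueError:
            name = f"unknown signal {-exitcode}"
        return f"killed by {name}"
    return f"exited with status {exitcode}"


print(describe_exitcode(-11))
```

Read this way, the ubuntu failure is a crash in the child process rather than a deadlock or an ordinary test failure.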

macOS 3.14t:

______________ test_run_in_process_direct[_intentional_deadlock] _______________

test_fn = <function _intentional_deadlock at 0x2898f7a4740>

    @pytest.mark.skipif(sys.platform.startswith("emscripten"), reason="Requires threads")
    @pytest.mark.parametrize("test_fn", ALL_BASIC_TESTS_PLUS_INTENTIONAL_DEADLOCK)
    @pytest.mark.skipif(
        "env.GRAALPY",
        reason="GraalPy transiently complains about unfinished threads at process exit",
    )
    def test_run_in_process_direct(test_fn):
        """Makes sure there is no GIL deadlock when using processes.
    
        This test is for completion, but it was never an issue.
        """
>       assert _run_in_process(test_fn) == 0

test_fn    = <function _intentional_deadlock at 0x2898f7a4740>

test_gil_scoped.py:269: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

target = <function _intentional_deadlock at 0x2898f7a4740>, args = ()
kwargs = {}, test_fn = <function _intentional_deadlock at 0x2898f7a4740>
timeout = 0.1
process = <Process name='Process-72' pid=5355 parent=5115 stopped exitcode=0 daemon>
t_start = 1747718249.89369, t_delta = 0.10827970504760742, @py_assert1 = 0
@py_assert4 = None, @py_assert3 = False
@py_format6 = "0\n{0 = <Process name='Process-72' pid=5355 parent=5115 stopped exitcode=0 daemon>.exitcode\n} is None"
@py_format8 = "assert 0\n{0 = <Process name='Process-72' pid=5355 parent=5115 stopped exitcode=0 daemon>.exitcode\n} is None"

    def _run_in_process(target, *args, **kwargs):
        test_fn = target if len(args) == 0 else args[0]
        # Do not need to wait much, 10s should be more than enough.
        timeout = 0.1 if test_fn is _intentional_deadlock else 10
        process = multiprocessing.Process(target=target, args=args, kwargs=kwargs)
        process.daemon = True
        try:
            t_start = time.time()
            process.start()
            if timeout >= 100:  # For debugging.
                print(
                    "\nprocess.pid STARTED", process.pid, (sys.argv, target, args, kwargs)
                )
                print(f"COPY-PASTE-THIS: gdb {sys.argv[0]} -p {process.pid}", flush=True)
            process.join(timeout=timeout)
            if timeout >= 100:
                print("\nprocess.pid JOINED", process.pid, flush=True)
            t_delta = time.time() - t_start
            if process.exitcode == 66 and m.defined_THREAD_SANITIZER:  # Issue #2754
                # WOULD-BE-NICE-TO-HAVE: Check that the message below is actually in the output.
                # Maybe this could work:
                # https://gist.github.com/alexeygrigorev/01ce847f2e721b513b42ea4a6c96905e
                pytest.skip(
                    "ThreadSanitizer: starting new threads after multi-threaded fork is not supported."
                )
            elif test_fn is _intentional_deadlock:
>               assert process.exitcode is None
E               AssertionError: assert 0 is None
E                +  where 0 = <Process name='Process-72' pid=5355 parent=5115 stopped exitcode=0 daemon>.exitcode

args       = ()
kwargs     = {}
process    = <Process name='Process-72' pid=5355 parent=5115 stopped exitcode=0 daemon>
t_delta    = 0.10827970504760742
t_start    = 1747718249.89369
target     = <function _intentional_deadlock at 0x2898f7a4740>
test_fn    = <function _intentional_deadlock at 0x2898f7a4740>
timeout    = 0.1

test_gil_scoped.py:187: AssertionError
=============================== warnings summary ===============================
<frozen importlib._bootstrap>:491
  <frozen importlib._bootstrap>:491: RuntimeWarning: The global interpreter lock (GIL) has been enabled to load module 'exo_planet_c_api', which has not declared that it can run safely without the GIL. To override this behavior and keep the GIL disabled (at your own risk), run with PYTHON_GIL=0 or -Xgil=0.

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html

I haven't been able to reproduce the flakes locally, including with pytest-run-parallel, pytest-repeat, and both reducing and increasing sys.setswitchinterval().
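A rough sketch of the kind of local stress loop described above, for anyone who wants to try reproducing: it runs a callable in several threads at once and repeats that under different switch intervals. The helper names `run_in_threads` and `stress`, the interval values, and the repeat count are all assumptions for illustration; this is not the code in `test_gil_scoped.py`.

```python
import sys
import threading


def run_in_threads(fn, num_threads=8):
    """Start num_threads threads running fn at once and wait for all of them."""
    barrier = threading.Barrier(num_threads)
    errors = []

    def worker():
        barrier.wait()  # release all threads together to maximize contention
        try:
            fn()
        except Exception as exc:  # collect the error instead of losing it in the thread
            errors.append(exc)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors


def stress(fn, intervals=(1e-6, 5e-3, 5e-2), repeats=20):
    """Repeat the threaded run under several switch intervals; return all errors."""
    original = sys.getswitchinterval()
    errors = []
    try:
        for interval in intervals:
            sys.setswitchinterval(interval)
            for _ in range(repeats):
                errors.extend(run_in_threads(fn))
    finally:
        # Always restore the interpreter's original setting.
        sys.setswitchinterval(original)
    return errors
```

A loop like this only surfaces Python exceptions. A crash such as the `-11` exit code above would take down the whole process, which is presumably why the real tests run each case in a separate process.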

@henryiii
Collaborator Author

henryiii commented May 22, 2025

These tests are specifically testing the GIL in a version of Python without a GIL. I expect we should either skip these tests or also add something like Py_CRITICAL_SECTION (or mutex) somewhere (maybe we should provide an API for that?)
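If skipping is the route taken, one possible shape for the condition is sketched below. It relies on `sysconfig.get_config_var("Py_GIL_DISABLED")` and `sys._is_gil_enabled()`, both of which exist in CPython 3.13 and later; the names `FREE_THREADED_BUILD`, `gil_currently_enabled`, and `SKIP_REASON` are made up for this sketch and are not existing pybind11 test helpers.

```python
import sys
import sysconfig

# True on free-threaded builds such as 3.13t and 3.14t; the config variable is
# 0 or None on regular builds.
FREE_THREADED_BUILD = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))


def gil_currently_enabled():
    """Report whether the GIL is active right now in this interpreter."""
    # sys._is_gil_enabled() was added in Python 3.13; older versions always have a GIL.
    check = getattr(sys, "_is_gil_enabled", None)
    return True if check is None else check()


SKIP_REASON = "GIL-specific test; not meaningful on a free-threaded build"
```

The tests could then carry something like `@pytest.mark.skipif(FREE_THREADED_BUILD, reason=SKIP_REASON)`. Keying on the build rather than on `gil_currently_enabled()` may be the safer choice, since the warnings above show the GIL being re-enabled at runtime when `exo_planet_c_api` is loaded, so the runtime state can change during a test session.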

@rwgk
Collaborator

rwgk commented May 25, 2025

Flake observed under #5688:

🐍 3.14t • macos-14 • x64

=================================== FAILURES ===================================
______________ test_run_in_process_direct[_intentional_deadlock] _______________

> These tests are specifically testing the GIL in a version of Python without a GIL. I expect we should either skip these tests or also add something like Py_CRITICAL_SECTION (or mutex) somewhere (maybe we should provide an API for that?)

If we believe the tests need changes for free-threading but nobody has time to work on them, skipping them seems like the best approach.
