Skip to content

[3.13.0rc1][greenlet] _PyFrame_GetCode: Assertion `PyCode_Check(f->f_executable)' failed (regression from 3.13.0b4) #122772

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
sergio-correia opened this issue Aug 7, 2024 · 15 comments
Labels
type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@sergio-correia
Copy link

sergio-correia commented Aug 7, 2024

Crash report

What happened?

File test_oops.py (based on a test from greenlet project that was observed to crash)

import unittest
from contextvars import copy_context

from greenlet import getcurrent, greenlet

class Oops_test(unittest.TestCase):
    def test_oops(self):
        let = greenlet(copy_context().run)
        let.switch(getcurrent().switch)

Run the above code with python3 -m unittest test_oops:

python3 -m unittest test_oops
Segmentation fault (core dumped)

Just tested with the current top of the 3.13 branch (5c161cb8329c941aa219dc34c56afa368516d6fb):

python -VV
Python 3.13.0rc1+ (heads/3.13-dirty:5c161cb832, Aug  7 2024, 05:59:15) [GCC 14.1.1 20240701 (Red Hat 14.1.1-7)]
python3 -m unittest test_oops
python3: ./Include/internal/pycore_frame.h:81: _PyFrame_GetCode: Assertion `PyCode_Check(f->f_executable)' failed.
Aborted (core dumped)

This is an intermittent crash, so it may not crash every single time, but if you run it in a loop, e.g. 10 times, it will likely crash in some of those.

This appears to be a regression observed from 3.13.0b4, and a bisect between 3.13.0b4 and 3.13.0rc1 identified 233ed46e6d2ba1d1f7d83a72ccf9aad9e628ede1 as the first bad commit (please double check):

commit 233ed46e6d2ba1d1f7d83a72ccf9aad9e628ede1
Author: Miss Islington (bot) <[email protected]>
Date:   Thu Jul 18 17:38:28 2024 +0200

    [3.13] gh-118934: Make PyEval_GetLocals return borrowed reference (GH-119769) (#121869)

    gh-118934: Make PyEval_GetLocals return borrowed reference (GH-119769)
    (cherry picked from commit e65cb4c6f01a687f451ad9db1600525e1c5832c4)

    Co-authored-by: Tian Gao <[email protected]>
    Co-authored-by: Alyssa Coghlan <[email protected]>

 Include/internal/pycore_frame.h                    |  4 +++
 Lib/test/test_sys.py                               |  2 +-
 .../2024-05-30-04-11-36.gh-issue-118934.fbDqve.rst |  1 +
 Objects/frameobject.c                              |  4 +++
 Python/ceval.c                                     | 33 +++++++++++++++++++++-
 5 files changed, 42 insertions(+), 2 deletions(-)

Please let me know if I can provide any additional info.

CPython versions tested on:

3.12

Operating systems tested on:

Linux

Output from running 'python -VV' on the command line:

Python 3.13.0rc1 (main, Aug 1 2024, 00:00:00) [GCC 14.1.1 20240701 (Red Hat 14.1.1-7)]

@sergio-correia sergio-correia added the type-crash A hard crash of the interpreter, possibly with a core dump label Aug 7, 2024
@sobolevn
Copy link
Member

sobolevn commented Aug 7, 2024

Can you please try to reproduce it without using greenlet lib? This would really help!

@Eclips4
Copy link
Member

Eclips4 commented Aug 7, 2024

Is there any chance that #122735 has also fixed that?

@sergio-correia
Copy link
Author

Is there any chance that #122735 has also fixed that?

Thanks for the suggestion, but It did not help this specific case, unfortunately; as you can see in the report, I also tried with the then-current top of the 3.13 branch, which was 5c161cb8329c941aa219dc34c56afa368516d6fb, i.e. the backport from the fix from #122735.

@sergio-correia
Copy link
Author

Can you please try to reproduce it without using greenlet lib? This would really help!

If I manage to do so, I will update the issue here, but so far I have only encountered the issue with the greenlet lib, from their test suite we run as part of some CI testing.

@sergio-correia
Copy link
Author

If I remove the Py_CLEAR(f->f_locals_cache) from frame_dealloc() [Objects/frameobject.c], the crash is gone, but I am guessing that likely means some leak?

@ZeroIntensity
Copy link
Member

If I remove the Py_CLEAR(f->f_locals_cache) from frame_dealloc() [Objects/frameobject.c], the crash is gone, but I am guessing that likely means some leak?

Yeah, that leaks the local variables from every frame. Although, considering that's the "fix," could this be a reference counting problem from Greenlet's end?

@sergio-correia
Copy link
Author

If I remove the Py_CLEAR(f->f_locals_cache) from frame_dealloc() [Objects/frameobject.c], the crash is gone, but I am guessing that likely means some leak?

Yeah, that leaks the local variables from every frame. Although, considering that's the "fix," could this be a reference counting problem from Greenlet's end?

Yeah, that would be a good candidate for the issue here. I will try to do some investigation on that front.

@ZeroIntensity
Copy link
Member

ZeroIntensity commented Aug 8, 2024

If I remove the Py_CLEAR(f->f_locals_cache) from frame_dealloc() [Objects/frameobject.c], the crash is gone, but I am guessing that likely means some leak?

Yeah, that leaks the local variables from every frame. Although, considering that's the "fix," could this be a reference counting problem from Greenlet's end?

Yeah, that would be a good candidate fo> > > If I remove the Py_CLEAR(f->f_locals_cache) from frame_dealloc() [Objects/frameobject.c], the crash is gone, but I am guessing that likely means some leak?

Yeah, that leaks the local variables from every frame. Although, considering that's the "fix," could this be a reference counting problem from Greenlet's end?

Yeah, that would be a good candidate for the issue here. I will try to do some investigation on that front.

I'll be happy to take a look too, if you can find what (C) function in Greenlet is segfaulting.

@ZeroIntensity
Copy link
Member

After installing a 3.13 port of greenlet from this PR, I wasn't able to produce a segfault from your reproducer. Does this only happen in CI?

@ZeroIntensity
Copy link
Member

Ah, Valgrind did pick something up. I'm guessing the bug is here, PyObject_GetArenaAllocator is not called in all cases, leaving alloc uninitialized. This doesn't look like a CPython issue, though.

@gaogaotiantian
Copy link
Member

Is there a backtrace at the crash site? The PR was meant to convert the behavior of PyEval_GetLocals() to 3.12. f->f_locals_cache is only used when PyEval_GetLocals() is called. Could you check the call site of f->f_locals_cache and confirm you are using it as a borrowed reference? aka the code always new ref the return value of PyEval_GetLocals().

@vstinner
Copy link
Member

vstinner commented Aug 9, 2024

I failed to reproduce the issue on 3.13 and main branches.

How did you configure and build Python?

@sergio-correia
Copy link
Author

I failed to reproduce the issue on 3.13 and main branches.

How did you configure and build Python?

I just used what was available in Fedora rawhide at the time. They have updated the 3.13rc1 build to have the fix in #122735, but you can still reproduce the issue e.g. if you use a rawhide container for quick testing -- you can get the image here, if you wish to test it: registry.fedoraproject.org/fedora:rawhide

As @ZeroIntensity mentioned yesterday (and you just did also) that he was unable to reproduce it with the linked greenlet PR (which seems to be from you; and the Fedora build includes it already) [1], I ended up rebuilding greenlet for testing and then the issue was gone... I opened this downstream issue earlier today: https://bugzilla.redhat.com/show_bug.cgi?id=2303832

[1] https://src.fedoraproject.org/rpms/python-greenlet/blob/rawhide/f/python-greenlet.spec#_24

@vstinner vstinner changed the title [3.13.0rc1] _PyFrame_GetCode: Assertion `PyCode_Check(f->f_executable)' failed (regression from 3.13.0b4) [3.13.0rc1][greenlet] _PyFrame_GetCode: Assertion `PyCode_Check(f->f_executable)' failed (regression from 3.13.0b4) Aug 30, 2024
@vstinner
Copy link
Member

I ended up rebuilding greenlet for testing and then the issue was gone.

Can you still reproduce the issue or not? If not, I suggest to close the issue.

@sergio-correia
Copy link
Author

I ended up rebuilding greenlet for testing and then the issue was gone.

Can you still reproduce the issue or not? If not, I suggest to close the issue.

I cannot. It seems it was "fixed" by a rebuild. I am closing this. Thanks everyone for the help.

@Eclips4 Eclips4 closed this as not planned Won't fix, can't repro, duplicate, stale Sep 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-crash A hard crash of the interpreter, possibly with a core dump
Projects
None yet
Development

No branches or pull requests

6 participants