-
-
Notifications
You must be signed in to change notification settings - Fork 33.7k
gh-120321: Make gi_frame_state transitions atomic in FT build #142599
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This makes generator frame state transitions atomic in the free threading build, which avoids segfaults when trying to execute a generator from multiple threads concurrently. There are still a few operations that aren't thread-safe and may crash if performed concurrently on the same generator/coroutine: * Accessing gi_yieldfrom/cr_await/ag_await * Accessing gi_frame/cr_frame/ag_frame * Async generator operations
This comment was marked as outdated.
This comment was marked as outdated.
Python/bytecodes.c
Outdated
| FT_ATOMIC_STORE_INT8_RELEASE(gen->gi_frame_state, FRAME_SUSPENDED + oparg); | ||
| #ifdef Py_GIL_DISABLED | ||
| ((_PyThreadStateImpl *)tstate)->gen_last_frame_state = FRAME_SUSPENDED + oparg; | ||
| #endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to factor this out into a helper function? It's duplicated in ceval.c. Also, having helper to update both would reduce the chance that we update one without the other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've refactored it out. I also moved the functions to ceval_macros.h from pycore_genobject.h because:
- they are only used in ceval.c and bytecodes.c
- I want to avoid adding additional includes to
pycore_genobject.hbecause I think it's pulled in by internal headers that some third-party projects tend to include.
Objects/genobject.c
Outdated
|
|
||
| gen->gi_frame_state = FRAME_EXECUTING; | ||
| #if defined(Py_GIL_DISABLED) && defined(Py_DEBUG) | ||
| ((_PyThreadStateImpl *)tstate)->gen_last_frame_state = FRAME_EXECUTING; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this set only in debug builds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed this to #if defined(Py_GIL_DISABLED). It's not really needed at all, but it'll help catch bugs in case some ceval code forgets to set gen_last_frame_state before returning.
In other words, when _PyEval_EvalFrame returns, gen_last_frame_state must be either FRAME_SUSPENDED[_YIELD_FROM] or FRAME_CLEARED and this helps differentiate this call to _PyEval_EvalFrame from earlier calls in the same thread.
Co-authored-by: Kumar Aditya <[email protected]>
Co-authored-by: Kumar Aditya <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes to bytecodes.c and ceval/cevalmacros look good.
I'm puzzled by the gen_last_frame_state field on the thread state.
It seems unnecessary and, if it is necessary, broken in case of reentrancy.
Objects/genobject.c
Outdated
| // Grab the last frame state from the thread state instead of the | ||
| // generator, as it may have changed if another thread resumed this | ||
| // generator. | ||
| int8_t frame_state = ((_PyThreadStateImpl *)tstate)->gen_last_frame_state; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can another thread resume this generator? This thread is executing it, so no other thread can run it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This thread is no longer executing the generator once it sets the frame state to suspended in YIELD_VALUE. Between that line in bytecodes.c and here, another thread can immediately start executing the same generator. That isn't true with the GIL because there are no opportunities for reentrancy or GIL release between YIELD_VALUE and the return from _PyEval_Frame. That lack of opportunities for reentrancy is also why it's safe to store it on the thread state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I see what this issue is.
What you want is an additional return value, to distinguish between a yield and a return.
I think the code would be clearer, if gen_last_frame_state field were renamed to something like generator_return_kind. Set it to RETURN before the call, and set it to YIELD in YIELD_VALUE.
Instead of testing for FRAME_STATE_SUSPENDED(frame_state) test for generator_return_kind != RETURN
Also add a comment about why it is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've renamed the field and adjusted the condition.
However, we need to set gen_last_frame_state in gen_clear_frame (or elsewhere in RETURN_VALUE). It's not enough to set it to RETURN before the _PyEval_EvalFrame call because generators can send to other generator. The "inner" generator will overwrite the gen_last_frame_state. (See the test failures on 4b82b8c)
Objects/genobject.c
Outdated
| FRAME_STATE_SUSPENDED(frame_state)); | ||
|
|
||
| if (!_Py_GEN_SET_FRAME_STATE(gen, frame_state, FRAME_EXECUTING)) { | ||
| goto retry; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for a goto. You can use do ... while (!_Py_GEN_SET_FRAME_STATE(gen, frame_state, FRAME_EXECUTING));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've switched from gotos to do/while loops
Also use it in both the default and FT builds.
Generator calls may be nested so it's not sufficient to set the field before the call to PyEval_Frame.
|
The changes look good. I'm concerned about the complexity of this for such an obscure use case, but I can't suggest a better way of doing it. |
mpage
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Co-authored-by: Kumar Aditya <[email protected]>
| if (!_Py_GEN_TRY_SET_FRAME_STATE(gen, frame_state, FRAME_CLEARED)) { | ||
| continue; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check seems to trigger a new compiler warning during build.
Objects/genobject.c:426:17: warning: code will never be executed [-Wunreachable-code]
continue;
Tested on MacOS with gcc.
This makes generator frame state transitions atomic in the free threading build, which avoids segfaults when trying to execute a generator from multiple threads concurrently.
There are still a few operations that aren't thread-safe and may crash if performed concurrently on the same generator/coroutine: