Skip to content

bpo-44525: Copy freevars in bytecode to allow calls to inner functions to be specialized #29595

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Nov 23, 2021

Conversation

markshannon
Copy link
Member

@markshannon markshannon commented Nov 17, 2021

This should speedup calls to functions, by allowing us to specialize calls to functions with free variables without penalizing calls to other functions.

  • Calls to functions without freevars might be a tiny bit quicker as we don't need to check for freevars when initializing the locals.
  • Calls to functions with freevars should be faster, as they can now be specialized.

However, there is a bit of yak-shaving involved:

  1. It is necessary to keep a reference to the function in the frame. This is actually cheaper than before as it we now only need borrowed references to the builtins and globals, saving a few loads and stores.
  2. In order to do 1, we need to create a new function for eval and friends, rather than just passing a FrameDescriptor.

Although the speedup is probably not significant this is worthwhile as it makes more operations explicit in the bytecode which what we want when optimizing.

https://bugs.python.org/issue44525

@markshannon markshannon added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Nov 18, 2021
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @markshannon for commit 4fbaa99 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Nov 18, 2021
Copy link
Member

@ericsnowcurrently ericsnowcurrently left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly LGTM.

I'm pretty sure there are a few slight (and very unlikely) races here. I also have a couple questions about the purpose of f_globals and f_builtins on the frame.

Comment on lines +24 to +25
PyObject *f_globals; /* Borrowed reference */
PyObject *f_builtins; /* Borrowed reference */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, we avoid extra refcount operations for these, since f_func is guaranteed to hold a ref, right? Cool!

The only problem I see is that the function's func_globals (or func_builtins) could get swapped out in another thread while this frame is executing. Then we'd lose that borrowed reference and potentially segfault. It's an unlikely case, but still a possibility.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having these on struct _interpreter_frame is helpful because we avoid extra pointer indirection, right?

However, that only matters in cases where these are used relatively frequently (so the cost of the indirection adds up), i.e. in the eval loop (via GLOBALS()). If we do not expect frame->f_globals to change during the eval loop then could we move both to local variables before starting the loop (and drop them from struct _interpreter_frame altogether)?

Note that then we would have a different possible race where the function's func_globals could get swapped out while the frame is running, leading to weirdness.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, why don't we similarly add f_closure to the frame? Is it because we want to avoid increasing the size of struct _interpreter_frame (and we don't actually use the closure that much, so the cost of the indirection is bearable)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, why don't we similarly add f_closure to the frame? Is it because we want to avoid increasing the size of struct _interpreter_frame (and we don't actually use the closure that much, so the cost of the indirection is bearable)?

We might want to avoid creating a closure object at all. We could put the cells at the end of the function object, rather than in a separate tuple. but that's future work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you skipped over the two other comments here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The frame has a strong reference to the function, which has a strong reference to both the globals and builtins.
So borrowed references to the globals and builtins are OK.

If we do not expect frame->f_globals to change during the eval loop then could we move both to local variables before starting the loop (and drop them from struct _interpreter_frame altogether)?

And where would you store those references across calls?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we keep f_func on the frame then that's where we'd get them.

@bedevere-bot
Copy link

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

@markshannon markshannon added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Nov 19, 2021
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @markshannon for commit 51f2e49 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Nov 19, 2021
Copy link
Member

@ericsnowcurrently ericsnowcurrently left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@markshannon markshannon merged commit 135cabd into python:main Nov 23, 2021
@markshannon markshannon deleted the copy-free-vars-in-bytecode branch November 23, 2021 10:12
thatbirdguythatuknownot added a commit to thatbirdguythatuknownot/cpython that referenced this pull request Nov 23, 2021
remykarem pushed a commit to remykarem/cpython that referenced this pull request Dec 7, 2021
…nctions to be specialized (pythonGH-29595)

* Make internal APIs that take PyFrameConstructor take a PyFunctionObject instead.

* Add reference to function to frame, borrow references to builtins and globals.

* Add COPY_FREE_VARS instruction to allow specialization of calls to inner functions.
mdickinson added a commit to mdickinson/refcycle that referenced this pull request Oct 7, 2023
…er (#87)

This PR skips frame annotation completeness tests for Python >= 3.11.
Python 3.11 added a reference to the function from the frame (see
python/cpython#29595), but didn't expose that reference at Python level.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants