Skip to content

Reduce the overhead of tracing, profiling, and quickening checks for calls #8

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

lpereira
Copy link

@lpereira lpereira commented Dec 1, 2021

This is my first attempt at implementing this idea. This modifies the compiler to add a new START_FUNCTION instruction right before visiting all the function body nodes, and then modifies the interpreter to move the tracing/profiling/quickening checks right after the start_frame label to the execution of this opcode. When quickening, this instruction is transformed into a NOP instruction, effectively removing. these checks.

Code compiles, runs, and I can see the START_FUNCTION opcode added on functions with dis, but I don't know yet how well this is working; especially the quickening step. Still need a bit more time to learn how to inspect these things.

I know for a fact that there are some things that I didn't consider yet, and will continue to look at this -- I'm opening the draft PR here just to make it easier to review the code and know if I'm in the right direction, as I'm new to this code base. One of the things these changes are missing are uses of the start_frame by things other than functions; I don't understand enough of the machinery there yet to fix this but this is just the question of spending some time following the code.

@lpereira
Copy link
Author

lpereira commented Dec 1, 2021

Pulling in @gvanrossum, who asked me to create this PR to review.

@lpereira lpereira force-pushed the start-function-for-faster-cpython branch 2 times, most recently from f684de5 to 1419a87 Compare December 1, 2021 23:06
@lpereira lpereira force-pushed the start-function-for-faster-cpython branch from 1419a87 to 669edfa Compare December 1, 2021 23:24
Copy link

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must be missing why you're adding a second RETURN code.

@lpereira lpereira force-pushed the start-function-for-faster-cpython branch from 669edfa to a22fab8 Compare December 1, 2021 23:36
Comment on lines +2502 to +2505
/* FIXME(lpereira): The stack pointer and f_state will be set to the same
* value here, and when jumping to return_value_without_tracing above. This
* is redundant, but saves us from copying some code. */

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, I would just duplicate the code. You also redundantly have an extra variable retval, which is only used in the TRACE_FUNCTION_EXIT() macro. Maybe something needs to be refactored again here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some refactoring and the variable that's only used when tracing is enabled is now gone. (It's now inside the macro, which now takes a parameter.)

I dislike both copying code and doing what has been done in this PR, to be honest... while I do have a slight preference to the way I've done things as it's quite a bit of code that don't need to be copied around.

Copy link
Member

@brandtbucher brandtbucher Dec 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another possible option: split the opcode in two? So we have one opcode which does the tracing, and one which does the return.

The original code would be EXIT_FUNCTION + RETURN_VALUE, and the quickening step would replace EXIT_FUNCTION with NOP (or, even better, RETURN_VALUE).

I'm pretty sure that could work correctly here, but I might be missing some edge case where an eval break could leave us in a weird state where we trace a return that never happens (or something).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, never mind. I don't know how much that would actually clean this up.

L. Pereira added 2 commits December 6, 2021 10:27
No change in generated code, other than this is a tiny bit easier to read in
the rare cases you need to.
No need to pass an "is_async" parameter if we can look at the statement
we're generating code for.
@lpereira lpereira force-pushed the start-function-for-faster-cpython branch from a22fab8 to 65eadae Compare December 6, 2021 18:35
L. Pereira added 4 commits December 6, 2021 10:47
This new opcode will check for tracing/profiling and check if the
function needs to be quickened.  This change mostly prepares the
terrain by emitting a START_FUNCTION opcode at every function prologue,
that currently does as much as a NOP.
Introduce a RETURN_VALUE_QUICK opcode that replaces RETURN_VALUE on
quickened functions.  The implementation for this new opcode is
exactly the same as the actual RETURN_VALUE opcode, with the exception
that it doesn't expand the {DTRACE,TRACE}_FUNCTION_EXIT() functions.
@lpereira lpereira force-pushed the start-function-for-faster-cpython branch from 65eadae to 6e1698e Compare December 6, 2021 18:49
@lpereira
Copy link
Author

lpereira commented Dec 6, 2021

(I can't seem to add anybody else to review this, but @markshannon asked to be tagged here.)

@gvanrossum
Copy link

(I can't seem to add anybody else to review this, but @markshannon asked to be tagged here.)

I tweaked some permissions. Can you do this now?

@lpereira lpereira requested a review from markshannon December 6, 2021 21:57
@lpereira
Copy link
Author

lpereira commented Dec 6, 2021

(I can't seem to add anybody else to review this, but @markshannon asked to be tagged here.)

I tweaked some permissions. Can you do this now?

Yes, just added Mark as a reviewer. Thanks.

@markshannon
Copy link
Member

Any more progress on this?

@lpereira
Copy link
Author

lpereira commented Dec 9, 2021

Any more progress on this?

I have been looking at other things the last day or so, so no. (I don't know if you have had the time to take a look at the PR.)

@markshannon
Copy link
Member

What do you want me to look at?
I don't see much point in reviewing the code until it passes the tests.

@markshannon
Copy link
Member

I think this is going to be difficult without breaking open YIELD_FROM first. I'm looking into doing that now.

@markshannon
Copy link
Member

See python#30035

@github-actions
Copy link

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale label Jan 13, 2022
@markshannon
Copy link
Member

Obsolete

@markshannon markshannon closed this Aug 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants