Reduce the overhead of tracing, profiling, and quickening checks for calls #8

lpereira · 2021-12-01T21:45:41Z

This is my first attempt at implementing this idea. It modifies the compiler to emit a new START_FUNCTION instruction right before visiting the function body nodes, and modifies the interpreter so that the tracing/profiling/quickening checks, which currently run right after the start_frame label, run when this opcode executes instead. When the function is quickened, the instruction is rewritten to a NOP, effectively removing these checks.
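As a rough illustration of the idea (not the actual ceval.c code), the toy interpreter below sketches how a prologue opcode can carry the per-call checks and then be rewritten to a NOP once the code is quickened. The `ToyCode` class, `QUICKEN_THRESHOLD`, and the tuple-based instruction format are invented for this sketch; only the opcode names START_FUNCTION and NOP come from the description above.

```python
from dataclasses import dataclass

QUICKEN_THRESHOLD = 2  # hypothetical warm-up count, not CPython's real value


@dataclass
class ToyCode:
    instructions: list  # list of (opname, arg) tuples
    calls: int = 0
    quickened: bool = False


def quicken(code):
    """Rewrite START_FUNCTION to NOP so later calls skip the checks."""
    code.instructions = [
        ("NOP", None) if op == "START_FUNCTION" else (op, arg)
        for op, arg in code.instructions
    ]
    code.quickened = True


def run(code, tracer=None):
    """Execute the toy bytecode; return (result, number of checks performed)."""
    stack, checks = [], 0
    # Iterate over a snapshot so quickening mid-run only affects later calls.
    for op, arg in list(code.instructions):
        if op == "START_FUNCTION":
            checks += 1  # the tracing/profiling/quickening check lives here
            if tracer is not None:
                tracer("call")
            code.calls += 1
            if code.calls >= QUICKEN_THRESHOLD and tracer is None:
                quicken(code)
        elif op == "NOP":
            pass
        elif op == "LOAD_CONST":
            stack.append(arg)
        elif op == "RETURN_VALUE":
            return stack.pop(), checks
    raise RuntimeError("fell off the end of the bytecode")


code = ToyCode([("START_FUNCTION", None), ("LOAD_CONST", 42), ("RETURN_VALUE", None)])
results = [run(code) for _ in range(3)]
```

In this sketch the first two calls each perform one check and the third performs none, because the prologue instruction has been replaced by then.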

The code compiles and runs, and dis shows the START_FUNCTION opcode in functions, but I don't know yet how well this works, especially the quickening step. I still need a bit more time to learn how to inspect these things.
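For reference, one way to check which opcodes a function starts and ends with is `dis.get_instructions()`. The `sample` function and the `opnames` helper below are made up for illustration. On a build of this branch the list would be expected to begin with START_FUNCTION, while stock CPython releases do not have that opcode.

```python
import dis


def sample(x):
    return x + 1


def opnames(func):
    """Return the opcode names of func's bytecode, in order."""
    return [instr.opname for instr in dis.get_instructions(func)]


names = opnames(sample)
# Print the first and last opcode names; the exact names depend on the
# interpreter version and on whether this patch is applied.
print(names[0], "...", names[-1])
```

On CPython 3.11 and later, `dis.dis()` also accepts `adaptive=True` to display specialized instructions; whether that is enough to inspect the quickening step on this branch is not something the thread establishes.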

I know for a fact that there are some things I haven't considered yet, and I will continue to look at this. I'm opening the draft PR here just to make it easier to review the code and find out whether I'm going in the right direction, as I'm new to this code base. One thing these changes are missing is handling the uses of start_frame by things other than functions; I don't understand enough of the machinery there yet to fix this, but it's just a matter of spending some time following the code.

lpereira · 2021-12-01T21:46:26Z

Pulling in @gvanrossum, who asked me to create this PR to review.

Include/internal/pycore_ast.h

gvanrossum

I must be missing why you're adding a second RETURN code.

Python/ceval.c

gvanrossum · 2021-12-02T04:14:18Z

+            /* FIXME(lpereira): The stack pointer and f_state will be set to the same
+             * value here, and when jumping to return_value_without_tracing above. This
+             * is redundant, but saves us from copying some code. */


Honestly, I would just duplicate the code. You also redundantly have an extra variable retval, which is only used in the TRACE_FUNCTION_EXIT() macro. Maybe something needs to be refactored again here.

lpereira · 2021-12-06

I did some refactoring and the variable that's only used when tracing is enabled is now gone. (It's now inside the macro, which now takes a parameter.)

I dislike both copying code and doing what has been done in this PR, to be honest, though I have a slight preference for the way I've done things, since it's quite a bit of code that doesn't need to be copied around.

brandtbucher · 2021-12-13

Another possible option: split the opcode in two? Then we have one opcode which does the tracing, and one which does the return.

The original code would be EXIT_FUNCTION + RETURN_VALUE, and the quickening step would replace EXIT_FUNCTION with NOP (or, even better, RETURN_VALUE).

I'm pretty sure that could work correctly here, but I might be missing some edge case where an eval break could leave us in a weird state where we trace a return that never happens (or something).
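A minimal sketch of the split suggested above, using a made-up list-of-names bytecode rather than real CPython structures: the unquickened epilogue is EXIT_FUNCTION followed by RETURN_VALUE, and quickening rewrites only the former. `quicken_exit` and `run_tail` are hypothetical helper names.

```python
def quicken_exit(instructions):
    """Return a copy with every EXIT_FUNCTION replaced by NOP."""
    return ["NOP" if op == "EXIT_FUNCTION" else op for op in instructions]


def run_tail(instructions, retval, events):
    """Run a function epilogue, recording a trace event where tracing happens."""
    for op in instructions:
        if op == "EXIT_FUNCTION":
            events.append(("return", retval))  # tracing hook fires here
        elif op == "RETURN_VALUE":
            return retval
        # NOP: nothing to do
    raise RuntimeError("epilogue did not return")


original = ["EXIT_FUNCTION", "RETURN_VALUE"]
quickened = quicken_exit(original)

slow_events, fast_events = [], []
slow_result = run_tail(original, 7, slow_events)
fast_result = run_tail(quickened, 7, fast_events)
```

Because the trace event and the actual return are separate instructions here, anything that interrupts execution between them would produce the "traced return that never happens" case mentioned above.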

brandtbucher · 2021-12-13

Hm, never mind. I don't know how much that would actually clean this up.

Introduce a RETURN_VALUE_QUICK opcode that replaces RETURN_VALUE in quickened functions. The implementation of this new opcode is exactly the same as that of the regular RETURN_VALUE opcode, except that it doesn't expand the {DTRACE,TRACE}_FUNCTION_EXIT() macros.
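The relationship between the two opcodes can be sketched with a toy dispatch table. This is an illustration only: the handler functions, `HANDLERS`, and `quicken_returns` are invented names, and the optional `trace_exit` callback stands in for the exit-tracing macros.

```python
def return_value(retval, trace_exit=None):
    """Toy RETURN_VALUE: run the exit hook (if any), then return."""
    if trace_exit is not None:
        trace_exit(retval)
    return retval


def return_value_quick(retval, trace_exit=None):
    """Toy RETURN_VALUE_QUICK: same result, but never calls the exit hook."""
    return retval


HANDLERS = {
    "RETURN_VALUE": return_value,
    "RETURN_VALUE_QUICK": return_value_quick,
}


def quicken_returns(instructions):
    """Return a copy with RETURN_VALUE swapped for its quick variant."""
    return [
        "RETURN_VALUE_QUICK" if op == "RETURN_VALUE" else op
        for op in instructions
    ]


seen = []
plain = HANDLERS["RETURN_VALUE"](3, seen.append)
quick_op = quicken_returns(["RETURN_VALUE"])[0]
quick = HANDLERS[quick_op](3, seen.append)
```

Both handlers return the same value; only the unquickened one reports the exit to the hook.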

lpereira · 2021-12-06T19:46:38Z

(I can't seem to add anybody else to review this, but @markshannon asked to be tagged here.)

gvanrossum · 2021-12-06T20:38:40Z

> (I can't seem to add anybody else to review this, but @markshannon asked to be tagged here.)

I tweaked some permissions. Can you do this now?

lpereira · 2021-12-06T21:57:43Z

> > (I can't seem to add anybody else to review this, but @markshannon asked to be tagged here.)
>
> I tweaked some permissions. Can you do this now?

Yes, just added Mark as a reviewer. Thanks.

markshannon · 2021-12-09T16:51:38Z

Any more progress on this?

lpereira · 2021-12-09T17:25:32Z

> Any more progress on this?

I have been looking at other things the last day or so, so no. (I don't know if you have had the time to take a look at the PR.)

markshannon · 2021-12-10T13:14:14Z

What do you want me to look at?
I don't see much point in reviewing the code until it passes the tests.

markshannon · 2021-12-10T16:40:53Z

I think this is going to be difficult without breaking open YIELD_FROM first. I'm looking into doing that now.

markshannon · 2021-12-10T17:53:28Z

See python#30035

github-actions · 2022-01-13T01:07:29Z

This PR is stale because it has been open for 30 days with no activity.

markshannon · 2023-08-07T14:33:43Z

Obsolete

lpereira force-pushed the start-function-for-faster-cpython branch 2 times, most recently from f684de5 to 1419a87 on December 1, 2021 23:06

gvanrossum reviewed Dec 1, 2021

Include/internal/pycore_ast.h

lpereira force-pushed the start-function-for-faster-cpython branch from 1419a87 to 669edfa on December 1, 2021 23:24

gvanrossum reviewed Dec 1, 2021

Python/ceval.c

lpereira force-pushed the start-function-for-faster-cpython branch from 669edfa to a22fab8 on December 1, 2021 23:36

gvanrossum reviewed Dec 2, 2021

L. Pereira added 2 commits December 6, 2021 10:27

Build "opcode_targets" with designated initializers

908222e

No change in generated code; the table is just a tiny bit easier to read in the rare cases you need to.

Simplify calls to compiler_function()

9e84c0f

No need to pass an "is_async" parameter if we can look at the statement we're generating code for.

lpereira force-pushed the start-function-for-faster-cpython branch from a22fab8 to 65eadae on December 6, 2021 18:35

L. Pereira added 4 commits December 6, 2021 10:47

compile: Introduce START_FUNCTION opcode

323c03d

This new opcode will check for tracing/profiling and check whether the function needs to be quickened. This change mostly prepares the ground by emitting a START_FUNCTION opcode in every function prologue; for now the opcode does nothing more than a NOP.

ceval: Move checks for tracing/profiling to START_FUNCTION implementation

9313c41

specialize: Quicken START_FUNCTION to NOP

2eccb5f

lpereira force-pushed the start-function-for-faster-cpython branch from 65eadae to 6e1698e on December 6, 2021 18:49

lpereira requested a review from markshannon December 6, 2021 21:57

github-actions bot added the stale label Jan 13, 2022

markshannon closed this Aug 7, 2023
