gh-115758: Optimizer constant propagation for 64-bit ints and doubles #117396

Closed

Conversation


@Fidget-Spinner Fidget-Spinner commented Mar 30, 2024

@Fidget-Spinner Fidget-Spinner changed the title gh-115758: Optimizer constant propagation for 64-bit ints gh-115758: Optimizer constant propagation for 64-bit ints and doubles Mar 31, 2024
@Fidget-Spinner
Member Author

For the following code:


def bench_constants():
    for _ in range(10000000):
        x = 1.0
        y = x + x + x + x + x + x + x + x + x + x + x
    for _ in range(10000000):
        x = 1
        y = x + x + x + x + x + x + x + x + x + x + x

if __name__ == '__main__':
    bench_constants()

Main with uops: 2.726s
This branch with uops: 861.7ms
The microbenchmark runs in less than a third of the time on this branch compared to main (roughly a 3x speedup).

@@ -792,48 +792,44 @@ def testfunc(n):

def test_float_add_constant_propagation(self):
def testfunc(n):
a = 1.0
Member Author


This is a limitation of loop-based traces. We should start projecting traces from function entries as well, but I'll leave that for Mark or someone else to do in the future.

@Fidget-Spinner
Member Author

@brandtbucher Windows builds are complaining there's no PyAPI_DATA for memcpy. But memcpy is a C stdlib function. I know stencils are built without stdlib, but is there some way around this? Or do I just expose memcpy as an extern?

@brandtbucher
Member

@brandtbucher Windows builds are complaining there's no PyAPI_DATA for memcpy. But memcpy is a C stdlib function. I know stencils are built without stdlib, but is there some way around this? Or do I just expose memcpy as an extern?

Yeah, sorry. You're gonna need to get rid of memcpy in the jitted code, either by rewriting the copy as a loop or making a call to a C API function that wraps memcpy.

The problem is that the C compiler wants to assume that memcpy is nearby, which isn't necessarily the case. So we have to add a layer of indirection to turn the 32-bit jump into a 64-bit one.
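
For illustration only (hypothetical code, not the actual change in this PR), rewriting the copy as a plain byte loop avoids any out-of-line libc call in the jitted code:

    #include <stddef.h>

    // Hypothetical sketch: copy n bytes without calling memcpy, so the
    // compiler emits an inline loop rather than a call it assumes is nearby.
    static void copy_bytes(unsigned char *dst, const unsigned char *src, size_t n)
    {
        for (size_t i = 0; i < n; i++) {
            dst[i] = src[i];
        }
    }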

@gvanrossum
Member

The problem is that the C compiler wants to assume that memcpy is nearby, which isn't necessarily the case. So we have to add a layer of indirection to turn the 32-bit jump into a 64-bit one.

TIL. :-) What is it that makes it work for C API functions? Are they always nearby, or do we always do an extra indirection?

And why would the C compiler "know" that memcpy is nearby?

@brandtbucher
Member

brandtbucher commented Apr 1, 2024

TIL. :-) What is it that makes it work for C API functions? Are they always nearby, or do we always do an extra indirection?

We always do the indirection. If it turns out that they are in range, we rewrite the jump to a more efficient one at runtime.

And why would the C compiler "know" that memcpy is nearby?

Because the default "small" code model assumes that all code lives within 31 bits of each other. We want this code model so the (common) jumps between instructions are efficient, at the cost of a layer of indirection between the (ideally less common) function calls.

So basically, all function calls have to be extern (or, for Windows, __declspec(dllimport)). We could change this by jitting trampolines for the out-of-range functions at some point, but that's just trading one type of indirection for another (and adds quite a bit of complexity). We already do this for AArch64 Linux out of necessity... and it ain't pretty.
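
As a rough illustration of the two call forms (simplified, not the actual stencil code):

    // Hypothetical sketch: under the small code model, a direct call is a
    // 32-bit relative jump, so the callee must sit within +/- 2 GiB of the
    // caller. Calling through a function pointer first loads a full 64-bit
    // address from data, so the callee can live anywhere.
    extern void nearby(void);          // direct call: rel32 displacement
    extern void (*far_away)(void);     // indirect call: address loaded from memory

    void caller(void)
    {
        nearby();       // efficient, but assumes the target is in range
        (*far_away)();  // one extra load -- the layer of indirection
    }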

@Fidget-Spinner
Member Author

Fidget-Spinner commented Apr 1, 2024

@brandtbucher I'm getting an assertion failure on the JIT debug build of 32-bit Windows. Specifically, the failing assertion is

                // 32-bit absolute address.
                // Check that we're not out of range of 32 unsigned bits:
                assert(value < (1ULL << 32));

from Jit.c (at least, this is all I can decode anyway, because the error logs have special broken characters).

WASI is also failing, but that seems to be a floating-point issue.

@brandtbucher
Member

Hm, okay, I'll try to dig into that a bit today. Sorry for the friction!

@brandtbucher
Member

It looks like non-JIT 32-bit Windows is failing too, though. Is it because you're trying to cram a 64-bit double into a 32-bit pointer?

@brandtbucher
Member

WASI is 32 bits also. I think that's the issue.

@Fidget-Spinner
Member Author

Fidget-Spinner commented Apr 1, 2024

It looks like non-JIT 32-bit Windows is failing too, though. Is it because you're trying to cram a 64-bit double into a 32-bit pointer?

WAIT, I forgot it casts to a 32-bit pointer. Ahh, thanks so much for that! So sorry for the noise!

@brandtbucher
Member

FWIW, I think 32-bit JIT builds only have a 32-bit operand for uops. This is by accident, not by design, but I just don't think we've noticed since we never stick anything wider than a pointer in there.
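
A small, hypothetical illustration of the mismatch (not the actual operand-packing code): on a 32-bit platform, a pointer-sized field simply cannot hold a double's bit pattern.

    #include <stdint.h>
    #include <string.h>

    // Hypothetical sketch: pack a double's bits into a pointer-sized value.
    uintptr_t pack_double(double x)
    {
        uint64_t bits;
        memcpy(&bits, &x, sizeof bits);   // 8 bytes of payload
        // On a 32-bit build uintptr_t is only 4 bytes, so this cast silently
        // drops the upper half -- the kind of overflow the JIT's
        // "value < (1ULL << 32)" assertion catches.
        return (uintptr_t)bits;
    }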

@Fidget-Spinner
Member Author

Ok WASI now passes. So there is indeed a bug in 32-bit JIT builds.

@markshannon
Member

The stats show a very small reduction in the number of _BINARY_OP_ADD_INTs, with an equivalent number of _LOAD_INTs added. Given that _LOAD_INT still needs to create a new int, and we still need to destroy the left operand and most of the right operands, I would expect the performance impact to be unmeasurably small.

An optimization like this could make sense when combined with another pass that more effectively eliminates dead values, but I'd be surprised if that were worth the effort, at least until the tier 2 optimizer is much more capable.
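
For context, the kind of folding under discussion looks roughly like this; the types and helper below are hypothetical, not the actual tier 2 optimizer API:

    #include <stdbool.h>
    #include <stdint.h>

    // Hypothetical sketch: during abstract interpretation, if both inputs to
    // an integer add are known constants, compute the sum now (overflow
    // handling omitted) so the add can be replaced with a constant load.
    // The load still has to materialize a fresh int object at runtime,
    // which is why the win here is small.
    typedef struct { bool is_const; int64_t value; } abstract_int;

    static bool fold_int_add(abstract_int left, abstract_int right, abstract_int *out)
    {
        if (left.is_const && right.is_const) {
            out->is_const = true;
            out->value = left.value + right.value;
            return true;   // caller rewrites _BINARY_OP_ADD_INT into a constant load
        }
        return false;      // inputs unknown: keep the original uop
    }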

@brandtbucher
Member

Is this still something we want to do right now, or should it wait for the partial evaluation pass?

@Fidget-Spinner
Member Author

I have a feeling we might want a pre-processing step in the specializer/redundancy eliminator pass first, before passing it to the partial evaluator. But either way works. So let's wait and see.

@Fidget-Spinner
Member Author

Doing this in the partial evaluation pass instead.
