Skip to content

gh-133395: add option for extension modules to specialize BINARY_OP/SUBSCR, apply to arrays #133396

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 22 commits into from
May 5, 2025

Conversation

@iritkatriel iritkatriel changed the title gh-133395: add option for extension modules to specialise BINARY_OP/SUBSCR, apply to arrays gh-133395: add option for extension modules to specialize BINARY_OP/SUBSCR, apply to arrays May 4, 2025
Copy link
Member

@markshannon markshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the _PyBinopSpecializationDescr supposed to be extensible?
If so we'll need a means for extensions to free the structs they have allocated. Or is the intention that they are statically allocated?
If they aren't extensible, maybe the VM should allocate them and clean them up?

@iritkatriel
Copy link
Member Author

Is the _PyBinopSpecializationDescr supposed to be extensible?

I assumed it's extensible and the extension manages it. The alternative is that it has a void* field that the extension sets and manages. The advantage of letting extensions allocate is that if there is no dynamic information, you don't need to allocate a new one for each instruction that uses the specialisation.

@eendebakpt
Copy link
Contributor

Are there any microbenchmarks that show this improves performance?

Co-authored-by: Bénédikt Tran <[email protected]>
@iritkatriel iritkatriel merged commit 082dbf7 into python:main May 5, 2025
60 checks passed
}
break;
}
/* _GUARD_BINARY_OP_EXTEND is not a viable micro-op for tier 2 because it uses the 'this_instr' variable */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, this change made it so that any traces containing BINARY_OP_EXTEND can't be JIT-compiled. Is there a way to rewrite this to be tier-2 friendly? Mutating inline caches doesn't work there.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably don't have to NULL the cache, just change the opcode to BINARY_OP and hope nobody tries to use the pointer in the cache anymore. Will make it harder to debug if they do.

@brandtbucher
Copy link
Member

Also, based on the benchmarking results that were posted, it looks like the coverage, sqlalchemy_declarative, and sqlalchemy_imperative benchmarks crashed on this branch but succeeded on the base commit. Were these fixed before merging?

@nascheme
Copy link
Member

nascheme commented May 6, 2025

This seems to cause the refleaks unit test to fail for the test_datetime module. The leak seems to be related to the ZoneInfoTest class, which uses the array.array type.

@iritkatriel
Copy link
Member Author

The crash should have been fixed. I'll check the leak.

Re tier 2 - maybe there's a way but we can just deopt them when jitting if not.

@vstinner
Copy link
Member

vstinner commented May 6, 2025

This seems to cause the refleaks unit test to fail for the test_datetime module. The leak seems to be related to the ZoneInfoTest class, which uses the array.array type.

The follow code is enough to reprodue the issue:

import array
import gc

def func():
    a=array.array('B', b'hello world')
    # call array_binop_specialize() 3 times, once per loop
    print("specialize")
    for _ in range(10): a[1]
    print("specialize")
    for _ in range(10): a[1]
    print("specialize")
    for _ in range(10): a[1]

func()

# array_subscr_free() is not called
del func
gc.collect()

The specialization calls PyMem_Malloc() in array_binop_specialize(), but nothing calls PyMem_Free(): array_subscr_free() is not called.

@Eclips4
Copy link
Member

Eclips4 commented May 6, 2025

It seems that array_subscr_guard behaves incorrectly. It never returns 0 (at least I can't make it return 0), which is needed to trigger array_subscr_free.

@iritkatriel
Copy link
Member Author

It seems that array_subscr_guard behaves incorrectly. It never returns 0 (at least I can't make it return 0), which is needed to trigger array_subscr_free.

I don't think that's the issue. Free needs to happen when the code object is freed, and currently it's not.

iritkatriel added a commit to iritkatriel/cpython that referenced this pull request May 6, 2025
…ze BINARY_OP/SUBSCR, apply to arrays (python#133396)"

This reverts commit 082dbf7.
@iritkatriel
Copy link
Member Author

I'm going to revert this so it doesn't delay the release today.
#133498

It's not terribly important for this to be in 3.14.

hugovk pushed a commit that referenced this pull request May 6, 2025
@brandtbucher
Copy link
Member

@iritkatriel I just ran JIT benchmarks on the head of this branch. Those three benchmarks are still broken, and it makes the JIT 1% slower (with individual benchmarks slowing down up to 25%, one benchmark using 8% more memory):

https://github.com/faster-cpython/benchmarking-public/blob/main/results/bm-20250505-3.14.0a7%2B-454bfc5-JIT/README.md#vs-base-1

So we should make sure to benchmark the JIT when this lands again, or find a new approach (I don't love having strong references in the inline caches, even just pointers to owned memory).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants