GH-107596: Specialize `str[int]` #107597

brandtbucher · 2023-08-03T18:18:20Z

This handles cases where a string (with any internal representation) is indexed by a medium nonnegative integer, resulting in an ASCII character.

Running the benchmarks/stats now...

Issue: Specialize str[int] #107596
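
To make the conditions in the description concrete, here is a rough Python model of when the specialized path could apply. It is only a sketch, not the C code in this PR: the function name `str_int_fast_path_applies` is hypothetical, the exact-type check on the index is an assumption, and the interpreter's notion of a "medium" integer is not modeled.

```python
def str_int_fast_path_applies(container, index):
    """Hypothetical model of the guards described above; not CPython's code."""
    # Assumption: only exact str and int objects qualify, since subclasses
    # may override indexing behavior.
    if type(container) is not str or type(index) is not int:
        return False
    # The index must be nonnegative and in bounds. The "medium" size limit
    # mentioned in the description is not modeled here.
    if not 0 <= index < len(container):
        return False
    # The character read must be ASCII, whatever the string's internal
    # representation is.
    return ord(container[index]) < 128
```

Under this model, `"héllo"[0]` qualifies while `"héllo"[1]` does not, even though both index the same non-ASCII string.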

TeamSpen210 · 2023-08-03T22:01:05Z

Python/specialize.c

            PySlice_Check(sub) ? SPEC_FAIL_SUBSCR_TUPLE_SLICE : SPEC_FAIL_OTHER);
        goto fail;
    }
+    if (container_type == &PyUnicode_Type) {


Would it be a good idea to check whether the string is all ASCII, so we don't immediately deoptimise in that case? Or just support all characters by calling `unicode_char` in the opcode case; it's the last branch anyway.

brandtbucher · 2023-08-04

Yeah, I think I'll just change the specialization to only support ASCII strings, rather than bothering with other widths. It's quick and easy to check, since strings know their encoding.

brandtbucher · 2023-08-04T22:21:13Z

Well, I think I know which new benchmark is responsible for this showing up so prominently in the failure stats...

I'm going to modify this to only support ASCII strings, and see if that further improves things.

brandtbucher · 2023-08-07T19:28:02Z

I'm going to revert that last change. Turns out the TOML benchmark that's hitting this so hard is indeed getting mostly ASCII characters out of "barely-Unicode" strings. So the original code is probably fine.
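
A small illustration of why the two guard policies differ on "barely-Unicode" strings. This is a sketch only: `count_hits` and the sample string are hypothetical, and the counts say nothing about the real benchmark numbers.

```python
def count_hits(text):
    """Count indices served by two hypothetical guard policies."""
    # Policy A: the whole string must be ASCII (the reverted change).
    ascii_only_string = len(text) if text.isascii() else 0
    # Policy B: only the character being read must be ASCII (the original code).
    ascii_result = sum(1 for ch in text if ord(ch) < 128)
    return ascii_only_string, ascii_result


sample = 'title = "café"'  # one non-ASCII character in an otherwise ASCII string
print(count_hits(sample))  # (0, 13)
```

A single non-ASCII character disqualifies every index under policy A, while policy B still serves 13 of the 14 reads.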

brandtbucher · 2023-08-08T00:22:39Z

I'm happy with the benchmarks and stats.

brandtbucher added 2 commits August 3, 2023 11:07

Specialize str[int]

9e03844

blurb add

ccddc2b

brandtbucher added performance Performance or resource usage interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Aug 3, 2023

brandtbucher requested a review from markshannon August 3, 2023 18:18

brandtbucher self-assigned this Aug 3, 2023

bedevere-bot mentioned this pull request Aug 3, 2023

Specialize str[int] #107596

Closed

bedevere-bot added the awaiting core review label Aug 3, 2023

fixup

9760665

TeamSpen210 reviewed Aug 3, 2023

brandtbucher added 4 commits August 4, 2023 16:04

Specialize for ASCII only

41097d1

Catch up with main

570b242

fixup

e03a937

Support ASCII characters from Unicode strings

f077383

brandtbucher merged commit ea72c6f into python:main Aug 8, 2023

bedevere-bot removed the awaiting core review label Aug 8, 2023
