-
-
Notifications
You must be signed in to change notification settings - Fork 32k
GH-130296: Avoid stack transients in four instructions. #130310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-130296: Avoid stack transients in four instructions. #130310
Conversation
…_KEYS into _LOAD_GLOBAL_MODULE
…ROM_KEYS into _LOAD_GLOBAL_BUILTINS
…into _LOAD_ATTR_MODULE
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, this seems like a nice improvement. I've given this a thorough look and have left a handful of comments, most of them to clarify my understanding and a couple to suggest some slight readability improvements.
While I can follow most of it, there are still a few gaps in my understanding of this code. Consequently, I'm not 100% confident about assessing the complete correctness of this change. That said, nothing amiss jumped out at me.
FWIW, the changes would have been easier to review if this had been split into one PR for each of the five high-level changes.
When you're done making the requested changes, leave the comment: |
I have made the requested changes; please review again |
Thanks for making the requested changes! @ericsnowcurrently: please review the changes made to this pull request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I've left one note about adding a comment to clarify something minor. Otherwise I'd say this is good to go.
# Add expansion for the second operand | ||
internal_offset = 0 | ||
for cache in part.caches[:-1]: | ||
internal_offset += cache.size | ||
expansions.append((part.name, f"OPERAND1_{part.caches[-1].size}", offset+internal_offset)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding a comment to indicate why we are looping over part.caches
if there are only 2:
# Add expansion for the second operand | |
internal_offset = 0 | |
for cache in part.caches[:-1]: | |
internal_offset += cache.size | |
expansions.append((part.name, f"OPERAND1_{part.caches[-1].size}", offset+internal_offset)) | |
# Add expansion for the second operand. It will use the last cache entry. | |
# We must account for unused cache entries, which is why we must iterate | |
# over part.caches. | |
internal_offset = 0 | |
for cache in part.caches[:-1]: | |
internal_offset += cache.size | |
expansions.append((part.name, f"OPERAND1_{part.caches[-1].size}", offset+internal_offset)) |
…GH-130310) * Combine _GUARD_GLOBALS_VERSION_PUSH_KEYS and _LOAD_GLOBAL_MODULE_FROM_KEYS into _LOAD_GLOBAL_MODULE * Combine _GUARD_BUILTINS_VERSION_PUSH_KEYS and _LOAD_GLOBAL_BUILTINS_FROM_KEYS into _LOAD_GLOBAL_BUILTINS * Combine _CHECK_ATTR_MODULE_PUSH_KEYS and _LOAD_ATTR_MODULE_FROM_KEYS into _LOAD_ATTR_MODULE * Remove stack transient in LOAD_ATTR_WITH_HINT
This PR:
_GUARD_GLOBALS_VERSION_PUSH_KEYS
and_LOAD_GLOBAL_MODULE_FROM_KEYS
into_LOAD_GLOBAL_MODULE
_GUARD_BUILTINS_VERSION_PUSH_KEYS
and_LOAD_GLOBAL_BUILTINS_FROM_KEYS
into_LOAD_GLOBAL_BUILTINS
_CHECK_ATTR_MODULE_PUSH_KEYS
and_LOAD_ATTR_MODULE_FROM_KEYS
into_LOAD_ATTR_MODULE
_CHECK_ATTR_WITH_HINT
into_LOAD_ATTR_WITH_HINT