bpo-43706: Use PEP 590 vectorcall to speed up enumerate() #25154
I'm not against such a micro-optimization, but I'm not convinced that it's worth the amount of argument-parsing code it adds in enum_vectorcall(). If bpo-43447 is implemented, I would be more comfortable accepting it. Right now, it adds many lines of code that must be maintained manually.
The micro-benchmark measures only the creation of the enumerate object; it does not iterate over it. I expect that for a long sequence the benefit is not significant, but for a short sequence it is more likely to be interesting.
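To make the creation-versus-iteration point concrete, here is a minimal sketch using the standard `timeit` module. It is not the benchmark used in this PR; the helper names `time_ns_per_call` and `compare`, the sequence sizes, and the loop counts are all assumptions chosen for illustration. It times `enumerate(seq)` alone and then a full loop over it, so the share of the total spent on construction can be compared for a short and a long sequence.

```python
import timeit


def time_ns_per_call(stmt, setup, number, repeat=5):
    """Return the best-of-`repeat` time for one execution of `stmt`, in ns.

    Hypothetical helper for this sketch, not part of the PR.
    """
    best = min(timeit.repeat(stmt, setup=setup, number=number, repeat=repeat))
    return best / number * 1e9


def compare(size, iter_number):
    """Time creating enumerate(seq) alone versus creating and iterating it."""
    setup = f"seq = list(range({size}))"
    # Construction only: this is the part a faster call path can speed up.
    create = time_ns_per_call("enumerate(seq)", setup, number=200_000)
    # Construction plus a full pass over the iterator.
    consume = time_ns_per_call(
        "for _ in enumerate(seq): pass", setup, number=iter_number
    )
    return create, consume


if __name__ == "__main__":
    # Sizes are arbitrary: one short sequence and one long sequence.
    for size, iter_number in ((3, 200_000), (10_000, 100)):
        create, consume = compare(size, iter_number)
        print(
            f"len={size}: create {create:.0f} ns, "
            f"create+iterate {consume:.0f} ns, "
            f"creation share {create / consume:.1%}"
        )
```

Absolute numbers depend on the build and the machine, so the ratio between the two timings is the more useful figure to compare across sequence lengths.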
@@ -80,6 +80,45 @@ enum_new_impl(PyTypeObject *type, PyObject *iterable, PyObject *start)
    return (PyObject *)en;
}

// TODO: Use AC when bpo-43447 is supported
I would prefer to track these tasks in the bpo rather than directly in the Python source code.
This was an expected concern when I started implementing this. I agree with you.
This PR is stale because it has been open for 30 days with no activity.
Can you please re-run your benchmark? Maybe Python performance has changed in the meantime.
Same effect.
LGTM.
bench enumerate: 533 ns -> 341 ns (1.56x faster) is worth it.
As mentioned in the issue, this fixes a regression in 3.11. The regression was introduced in pythonGH-25154 (bpo-43706). There were already comments there about how this was too much code for a simple change. This makes it even worse.
https://bugs.python.org/issue43706