Asymptotically faster int(string) #118750
Labels: performance (Performance or resource usage)
Comments
tim-one added a commit that referenced this issue on May 19, 2024:
Asymptotically faster (O(n log n)) str->int for very large strings, leveraging the faster multiplication scheme in the C-coded `_decimal` when available. This is used instead of the current Karatsuba-limited method starting at 2 million digits.

Lots of opportunity remains for fine-tuning. Good targets include changing BYTELIM, and possibly changing the internal output base (from 256 to a higher number of bytes).

Doing this was substantial work, and many of the new lines are actually comments giving correctness proofs. The obvious approaches sticking to integers were too slow to be useful, so this is doing variable-precision decimal floating-point arithmetic. Much faster, but worst-possible rounding errors have to be wholly accounted for, using as little precision as possible.

Special thanks to Serhiy Storchaka for asking many good questions in his code reviews!

Co-authored-by: Jelle Zijlstra <[email protected]>
Co-authored-by: sstandre <[email protected]>
Co-authored-by: Pieter Eendebak <[email protected]>
Co-authored-by: Nice Zombies <[email protected]>
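For orientation, here is a minimal sketch (not CPython's actual code) of the divide-and-conquer shape such str -> int conversions share: split the digit string in half, convert each half recursively, and combine with one big multiplication. The total cost is dominated by the multiplications, which is why a faster multiplication scheme (the C-coded `_decimal`'s, in the real change) pays off for huge inputs. The 1000-digit cutoff below is an arbitrary illustrative value.

```python
def str_to_int(s: str) -> int:
    # Minimal illustrative sketch only; not the _pylong.py implementation.
    if len(s) <= 1000:                       # small pieces: plain built-in int()
        return int(s)
    mid = len(s) // 2
    hi, lo = s[:mid], s[mid:]
    # One big multiply shifts the high half into place, then add the low half.
    return str_to_int(hi) * 10 ** len(lo) + str_to_int(lo)

digits = "123456789" * 400                   # 3600 digits
assert str_to_int(digits) == int(digits)
```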
estyxx pushed a commit to estyxx/cpython that referenced this issue on Jul 17, 2024 (same commit message as above).
Proposal:
EDIT: I made better progress on this than I expected, reducing the cutoff dramatically to about 3.4 million digits, by revisiting and improving the "cached reciprocal approximation" approach. Still not really compelling, but ... what the heck 😉.
Way back when we were first working on ideas for faster `int` <-> decimal string conversions, @pochmann worked up a function that used the `decimal` module to, in effect, convert from base 10 to base 256. This is "unnatural", in that it requires multiplying and dividing by large powers of 2, which `decimal` isn't naturally suited to. But `decimal`'s `*` and `/` are asymptotically superior to CPython's, so at some point it could be expected to win.

Alas, the crossover point was too high to be of much real interest. I then worked on ways to replace its division with multiplication by a cached reciprocal approximation instead, fixing up (usually small) errors afterwards. This reduced the crossover point significantly, but it was still too high to be of much real interest. And the code grew increasingly complex.
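As an aside, here is a tiny illustration of that "cached reciprocal" idea, written with plain Python ints rather than `decimal` (the function name, divisor, and `shift` parameter are made up for illustration): division by a fixed divisor is replaced by multiplication with a precomputed fixed-point approximation of its reciprocal, and the small remaining error is corrected afterwards.

```python
def make_cached_div(d: int, shift: int):
    # Illustrative sketch only; the real experiments used decimal, not ints.
    recip = (1 << shift) // d + 1        # cached over-estimate of 2**shift / d
    def div(n: int) -> int:
        q = (n * recip) >> shift         # cheap approximate quotient
        # Fix up the approximation; for n below 2**shift the error is at
        # most 1, so these loops run at most once.
        while q * d > n:
            q -= 1
        while (q + 1) * d <= n:
            q += 1
        return q
    return div

div_by_1000003 = make_cached_div(1000003, shift=64)
n = 10**18 + 12345
assert div_by_1000003(n) == n // 1000003
```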
While working on `_pylong.py` across the last few days on a different issue, I had a different idea: there's no actual need to divide by powers of 2 at all. Since `2*5 = 10`, division by `2**k` is the same as multiplying by `5**k` (exact) and then dividing by `10**k` (also exact in `decimal`, and dirt cheap: the module's `scaleb()` function, which just adjusts the internal exponent).

This yields short and reasonably clear code with a much lower crossover value, at about 26 million digits. That's still too high to be compelling, but I want to record the progress in case someone else is intrigued enough to pursue it (I'm out of time for this for now). So I intend to add the (working, but unused) code to `_pylong.py`.
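A minimal sketch of that `2*5 = 10` trick, assuming nothing about the real `_pylong.py` code (the helper name and precision bookkeeping here are just for illustration): dividing an integer by `2**k` exactly in `decimal` takes one exact multiplication by `5**k` plus a `scaleb(-k)`, which merely shifts the decimal exponent.

```python
from decimal import Decimal, localcontext

def div_by_pow2(x: int, k: int) -> Decimal:
    # Illustrative sketch only: x / 2**k == (x * 5**k) / 10**k, and the
    # division by 10**k is a free exponent adjustment via scaleb().
    p5 = 5 ** k                                  # exact Python int
    with localcontext() as ctx:
        # Enough precision that x * 5**k is computed exactly.
        ctx.prec = len(str(abs(x))) + len(str(p5))
        return (Decimal(x) * Decimal(p5)).scaleb(-k)

assert div_by_pow2(1000, 3) == Decimal("125")    # 1000 / 8
assert div_by_pow2(7, 2) == Decimal("1.75")      # 7 / 4
assert div_by_pow2(123456789, 20) * 2**20 == 123456789
```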
Has this already been discussed elsewhere?
No response given
Links to previous discussion of this feature:
No response
Linked PRs
int(string) #118751