gh-100687: Reduce frequency of overallocation in long_add #100688
I tested the PR with the (some profiling with valgrind shows that only 11% is spent in the In the
Co-authored-by: Pieter Eendebak <[email protected]>
add test for long addition with over allocation
This needs a couple more changes thanks to the recent changes in long format. Converting to draft while I investigate.
Should be good to go now. I've removed the special-case handling of normalization: it seems safer to call |
@eendebakpt Thank you for looking into this, and for the suggestions and test! I fear that my changes have invalidated your benchmarks, but I suspect the take-away that performance isn't significantly affected either way remains valid.
The |
LGTM.
Addresses #100687; see the issue for discussion and rationale.
This PR does some extra work to get a tighter upper bound on the number of digits needed for the addition of two like-signed multidigit ints (or the subtraction of two oppositely-signed multidigit ints).
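To make the idea of a tighter bound concrete, here is a minimal pure-Python sketch. It is not the C code from this PR: the 30-bit digit width and the helper names `digits`, `naive_bound` and `tighter_bound` are assumptions made for illustration, and the rule shown (inspect only the most significant digits) is just one way such a bound could be computed.

```python
SHIFT = 30            # assumed digit width (many CPython builds use 30-bit digits)
BASE = 1 << SHIFT
MASK = BASE - 1


def digits(n):
    """Little-endian base-2**SHIFT digits of a non-negative int."""
    out = []
    while n:
        out.append(n & MASK)
        n >>= SHIFT
    return out


def naive_bound(a, b):
    """Always reserve one extra digit for a possible carry."""
    return max(len(digits(a)), len(digits(b))) + 1


def tighter_bound(a, b):
    """Upper bound on the digit count of a + b for positive ints a and b.

    Hypothetical rule: the carry into the top position is at most 1, so an
    extra digit is only possible when the top digits sum to at least MASK.
    """
    da, db = digits(a), digits(b)
    if len(da) < len(db):
        da, db = db, da
    top = da[-1]
    if len(da) == len(db):
        top += db[-1]
    return len(da) + (1 if top >= MASK else 0)


if __name__ == "__main__":
    a = b = 1 << SHIFT  # two digits each, small top digits
    print(naive_bound(a, b), tighter_bound(a, b), len(digits(a + b)))
```

The bound is still conservative: it can report one digit more than the sum actually needs, but it never reports fewer, and it avoids the extra digit in the common case where the top digits are small.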
Here's a benchmark script, tracking both memory allocations and performance in the case that's most significant, where both of the addends have exactly two digits (the case of single digit ints takes a fast path and doesn't use `x_add`).

Some sample results on my macOS / Intel machine. (`cpython` is `main`, `cpython-modified` is this branch; both were built with `--enable-optimizations`.) Surprisingly, I actually get a minor speed increase where I was expecting a decrease.

To do: