-
Notifications
You must be signed in to change notification settings - Fork 13.3k
Optimize sum of Durations by using custom function #51598
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
r? @sfackler (someone from the libs team) |
Ping from triage @sfackler! This PR needs your review. |
@bors r+ Sorry for the delay! |
📌 Commit d22ad76 has been approved by |
bors
added a commit
that referenced
this pull request
Jun 27, 2018
Optimize sum of Durations by using custom function The current `impl Sum for Duration` uses `fold` to perform several `add`s (or really `checked_add`s) of durations. In doing so, it has to guarantee the number of nanoseconds is valid after every addition. If you squeese the current implementation into a single function it looks kind of like this: ````rust fn sum<I: Iterator<Item = Duration>>(iter: I) -> Duration { let mut sum = Duration::new(0, 0); for rhs in iter { if let Some(mut secs) = sum.secs.checked_add(rhs.secs) { let mut nanos = sum.nanos + rhs.nanos; if nanos >= NANOS_PER_SEC { nanos -= NANOS_PER_SEC; if let Some(new_secs) = secs.checked_add(1) { secs = new_secs; } else { panic!("overflow when adding durations"); } } sum = Duration { secs, nanos } } else { panic!("overflow when adding durations"); } } sum } ```` We only need to check if `nanos` is in the correct range when giving our final answer so we can have a more optimized version like so: ````rust fn sum<I: Iterator<Item = Duration>>(iter: I) -> Duration { let mut total_secs: u64 = 0; let mut total_nanos: u64 = 0; for entry in iter { total_secs = total_secs .checked_add(entry.secs) .expect("overflow in iter::sum over durations"); total_nanos = match total_nanos.checked_add(entry.nanos as u64) { Some(n) => n, None => { total_secs = total_secs .checked_add(total_nanos / NANOS_PER_SEC as u64) .expect("overflow in iter::sum over durations"); (total_nanos % NANOS_PER_SEC as u64) + entry.nanos as u64 } }; } total_secs = total_secs .checked_add(total_nanos / NANOS_PER_SEC as u64) .expect("overflow in iter::sum over durations"); total_nanos = total_nanos % NANOS_PER_SEC as u64; Duration { secs: total_secs, nanos: total_nanos as u32, } } ```` We now only convert `total_nanos` to `total_secs` (1) if `total_nanos` overflows and (2) at the end of the function when we have to output a valid `Duration`. This gave a 5-22% performance improvement when I benchmarked it, depending on how big the `nano` value of the `Duration`s in `iter` were.
☀️ Test successful - status-appveyor, status-travis |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
S-waiting-on-bors
Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
The current
impl Sum for Duration
usesfold
to perform severaladd
s (or reallychecked_add
s) of durations. In doing so, it has to guarantee the number of nanoseconds is valid after every addition. If you squeese the current implementation into a single function it looks kind of like this:We only need to check if
nanos
is in the correct range when giving our final answer so we can have a more optimized version like so:We now only convert
total_nanos
tototal_secs
(1) iftotal_nanos
overflows and (2) at the end of the function when we have to output a validDuration
. This gave a 5-22% performance improvement when I benchmarked it, depending on how big thenano
value of theDuration
s initer
were.