Add fold_mut alternative to fold #481

Closed
wants to merge 1 commit

Conversation

@mlodato517 commented Sep 21, 2020

This PR
Adds a fold_mut alternative to fold

Why?

  1. Sometimes the closure in a fold requires an awkward final line to return the accumulator, as mentioned here.
  2. Sometimes fold_mut can be faster. And sometimes it can be slower! See the benchmarks here. TL;DR: if the accumulator is at least as big as a &mut to it, fold can be slower than fold_mut, because moving the larger accumulator is more work than moving the &mut.
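For reference, here's a minimal sketch of what I have in mind. The signature follows the PR description (mutate-in-place closure, built on for_each); I've written it as a free function rather than an Itertools method to keep the sketch self-contained:

```rust
// Sketch of the proposed `fold_mut`: like `fold`, but the closure mutates
// the accumulator in place instead of returning it. Building on `for_each`
// lets iterator-specific `fold`/`for_each` specializations kick in.
fn fold_mut<I, B, F>(iter: I, init: B, mut f: F) -> B
where
    I: Iterator,
    F: FnMut(&mut B, I::Item),
{
    let mut acc = init;
    iter.for_each(|item| f(&mut acc, item));
    acc
}

fn main() {
    // No awkward trailing `evens` line inside the closure:
    let evens = fold_mut([1, 2, 3, 4, 5, 6].into_iter(), Vec::new(), |evens, num| {
        if num % 2 == 0 {
            evens.push(num);
        }
    });
    assert_eq!(evens, vec![2, 4, 6]);
}
```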

Background
A lot of the background is in this issue and this PR. Based on the discussion there, it seemed that this wasn't warranted for the standard library, so I figured I'd pitch it here!

Benchmarks
I ran the benchmarks like:

cargo bench "fold sum accumulator" \
  && cargo bench "fold vec accumulator" \
  && cargo bench "fold chained iterator with num accumulator" \
  && cargo bench "fold chained iterator with vec accumulator" \
  && open target/criterion/report/index.html

(not sure if there's a shorter syntax) and saw:

| What | Benchmark |
| --- | --- |
| folding an i64 | num |
| folding a vec | vec |
| folding a chained iterator into an i64 | i64_chain |
| folding a chained iterator into a vec | vec_chain |

I could also add a benchmark with like an i8 or something to show how much faster fold is when folding into something much smaller than a reference if we want that! I just had a hard time coming up with an accumulator that wouldn't overflow but would still be super readable. It might be like (0i8..10).chain(-10i8..0).cycle().take(1_000_000) or some such.

group.bench_function("fold", |b| {
    b.iter(|| {
        (0i64..1_000_000)
            .chain(0i64..1_000_000)
@mlodato517 (Author)
Added a benchmark with chain to show that, because fold_mut uses for_each, it benefits from Chain's specialization of fold
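To illustrate that point outside of criterion (this is my own rough single-shot timing, not the PR's benchmark): internal iteration via for_each runs each half of the Chain with its own fold, instead of dispatching next() across the seam on every item.

```rust
use std::time::Instant;

// Rough illustration (not a rigorous benchmark): summing a chained range
// with internal iteration. `Chain`'s `fold` specialization folds each half
// separately, so a `for_each`-based `fold_mut` avoids per-item `next()`
// dispatch across the chain's seam.
fn main() {
    let start = Instant::now();
    let mut acc = 0i64;
    (0i64..1_000_000)
        .chain(0i64..1_000_000)
        .for_each(|x| acc += x);
    assert_eq!(acc, 999_999_000_000);
    println!("sum = {acc} in {:?}", start.elapsed());
}
```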

/// # Examples
/// ```
/// # use itertools::Itertools;
/// let evens = [1, 2, 3, 4, 5, 6].iter().fold_mut(Vec::new(), |evens, &num| {
///     if num % 2 == 0 { evens.push(num); }
/// });
/// assert_eq!(evens, vec![2, 4, 6]);
/// ```
@mlodato517 (Author)

Is this example too lame? I was going to use the example of counting but ... there's .counts right above this so that seemed worse haha

@mlodato517 mlodato517 marked this pull request as ready for review September 21, 2020 17:47
@phimuemue (Member)

> Sometimes the closure in a fold requires an awkward final line to return the accumulator as mentioned here.

This bugged me in some of my projects, too. However, I realized that I would then also need e.g. fold1_mut and others.

Thus, I solved this in a different way: I introduced mutate_return, which "lifts" a Fn(&mut S, A) to a Fn(S, A) -> S (see here), making this independent of fold.
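A minimal sketch of that adapter (the name mutate_return is from the comment above; the body is my reconstruction, not the linked implementation):

```rust
// Sketch of `mutate_return`: lift a mutating closure `FnMut(&mut S, A)`
// into a returning closure `FnMut(S, A) -> S`, so it plugs into plain
// `fold` (or `fold1`, etc.) with no new fold variants needed.
fn mutate_return<S, A, F>(mut f: F) -> impl FnMut(S, A) -> S
where
    F: FnMut(&mut S, A),
{
    move |mut state, item| {
        f(&mut state, item);
        state
    }
}

fn main() {
    let evens = [1, 2, 3, 4, 5, 6]
        .into_iter()
        .fold(Vec::new(), mutate_return(|acc: &mut Vec<i32>, n| {
            if n % 2 == 0 {
                acc.push(n);
            }
        }));
    assert_eq!(evens, vec![2, 4, 6]);
}
```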

> Sometimes fold_mut can be faster.

In my implementation I assumed that the compiler is clever enough to optimize it as well as possible in any case; reading your PR, that assumption seems wrong.

So, my questions:

  1. Should we also offer fold1_mut and others? (How far should we go, then?)
  2. In an ideal world, I think the compiler should take care of the optimizations necessary, so that using fold vs. fold_mut should result in identical, most efficient code. Is it foreseeable that the compiler becomes that clever? (If so, I'd be reluctant to introduce fold_mut.)

@mlodato517 (Author)

> Should we also offer fold1_mut and others? (How far should we go, then?)

Great question! I have no idea 🤔 I imagine it'd be great to have this variant of all the methods since you can implement fold_mut in terms of fold (although, of course, you get the performance hit back)

> Is it foreseeable that the compiler becomes that clever?

Maybe! There's a little discussion going on here. I do think that, if we could get fold to be as fast as fold_mut, then that'd be great, but I still wonder if the ergonomics of fold_mut might be desirable in some cases (although, maybe we wait for the clippy lint to be discussed?).

@mlodato517 (Author)

Just checking back in for some quick guidance here - should I be working to implement fold_mut for the other fold variants or should we close this?

@mlodato517 (Author)

I think this is sufficiently stale to close :-)

@mlodato517 mlodato517 closed this Nov 5, 2020
@mlodato517 mlodato517 deleted the ml-fold-mut branch November 5, 2020 17:15