Improve kmerge's perf #101


Merged: 5 commits merged into master on Feb 23, 2016

Conversation

@bluss (Member) commented Feb 23, 2016

Use a simple heap / sift down implementation to optimize kmerge

A hand-rolled heap implementation allows us to improve kmerge. This
follows Frank McSherry's implementation (differential-dataflow project).

Use a sift-down that places elements eagerly (the alternate strategy would
be to sift down all the way to the bottom, then sift back up).

Keep the iterator in place, and sift it down from the top. This assumes
the current minimal iterator is unlikely to move very far to adjust for
its new element. This should perform well when there is a run of least
elements from the same iterator.
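The description above can be sketched in code. This is a minimal, self-contained version of the idea, illustrative only: the names HeadTail, sift_down, and kmerge mirror the PR's approach, not its exact code.

```rust
/// An iterator paired with its buffered next element.
struct HeadTail<I: Iterator> {
    head: I::Item,
    tail: I,
}

impl<I: Iterator> HeadTail<I> {
    /// Wrap an iterator; returns None if it is already exhausted.
    fn new(mut it: I) -> Option<Self> {
        it.next().map(|head| HeadTail { head, tail: it })
    }
}

/// Restore the min-heap property by sifting heap[pos] down,
/// placing the element eagerly at each level.
fn sift_down<I>(heap: &mut [HeadTail<I>], mut pos: usize)
where
    I: Iterator,
    I::Item: Ord,
{
    loop {
        let left = 2 * pos + 1;
        if left >= heap.len() {
            return;
        }
        let right = left + 1;
        // pick the smaller child
        let child = if right < heap.len() && heap[right].head < heap[left].head {
            right
        } else {
            left
        };
        // strict `<`: stop as soon as the child is not strictly smaller
        if heap[child].head < heap[pos].head {
            heap.swap(pos, child);
            pos = child;
        } else {
            return;
        }
    }
}

/// Merge already-sorted iterators into one sorted Vec.
fn kmerge<I>(iters: Vec<I>) -> Vec<I::Item>
where
    I: Iterator,
    I::Item: Ord,
{
    let mut heap: Vec<_> = iters.into_iter().filter_map(HeadTail::new).collect();
    // heapify
    for i in (0..heap.len() / 2).rev() {
        sift_down(&mut heap, i);
    }
    let mut out = Vec::new();
    while !heap.is_empty() {
        match heap[0].tail.next() {
            // keep the winning iterator in place; just swap in its new head
            Some(next) => out.push(std::mem::replace(&mut heap[0].head, next)),
            // iterator exhausted: move the last slot to the root and drop it
            None => {
                let last = heap.len() - 1;
                heap.swap(0, last);
                out.push(heap.pop().unwrap().head);
            }
        }
        sift_down(&mut heap, 0);
    }
    out
}
```

When one iterator yields a run of least elements, the sift-down terminates immediately at the root, which is where the win on the two-way bench comes from.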

Benchmark

Large wins on the small (2 iterators) bench because we can avoid lots of
swaps. The "tenway" bench uses random elements, so the iterators should
alternate a lot, making it somewhat of a worst case.

```
before

test kmerge_default        ... bench:     123,363 ns/iter (+/- 15,518)
test kmerge_tenway         ... bench:     908,695 ns/iter (+/- 270,372)

after

test kmerge_default        ... bench:       9,123 ns/iter (+/- 235)
test kmerge_tenway         ... bench:     368,562 ns/iter (+/- 5,517)
```

@bluss (Member, Author) commented Feb 23, 2016

cc @bsteinb

Maybe I'm stealing your fun here, by implementing some improvements — apologies!

Thanks to @frankmcsherry for graciously putting his code online so that I could read & copy it directly 😄

The custom heap also makes us ready for a "kmerge_by" adaptor (user-supplied ordering closure).
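Such an adaptor would thread a user-supplied "less than" closure through the comparisons instead of requiring Ord. A sketch of the idea (hypothetical helper, not the itertools API; a linear scan for the minimum stands in for the heap to keep it short):

```rust
/// Illustrative only: k-merge sorted Vecs with a caller-supplied
/// "less than" closure; the real adaptor would drive the heap with it.
fn kmerge_by<T, F>(lists: Vec<Vec<T>>, mut less: F) -> Vec<T>
where
    F: FnMut(&T, &T) -> bool,
{
    // buffer each non-empty list's head alongside its iterator
    let mut iters: Vec<_> = lists
        .into_iter()
        .filter_map(|l| {
            let mut it = l.into_iter();
            it.next().map(|head| (head, it))
        })
        .collect();
    let mut out = Vec::new();
    while !iters.is_empty() {
        // find the slot whose head is least under `less`
        let mut min = 0;
        for i in 1..iters.len() {
            if less(&iters[i].0, &iters[min].0) {
                min = i;
            }
        }
        match iters[min].1.next() {
            Some(next) => out.push(std::mem::replace(&mut iters[min].0, next)),
            None => out.push(iters.swap_remove(min).0),
        }
    }
    out
}
```

Because the ordering is a closure, the same machinery merges descending runs with |a, b| a > b, which Ord-based kmerge cannot express.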

@bluss (Member, Author) commented Feb 23, 2016

Related to #98 (but there are still avenues to explore there).

@bsteinb (Contributor) commented Feb 23, 2016

Steal away! 👍 My fun is not going to stand in the way of progress.

Very impressive increase in performance for the two-way merge. That seems to go well beyond what I achieved with the modified BinaryHeap from std. Could you say how much you gained by inlining HeadTail's next() method by hand?

Stopping on equality skips even more redundant swaps. It's actually
visible in benchmarks.

```
after:

test kmerge_default           ... bench:       9,123 ns/iter (+/- 235)
test kmerge_tenway            ... bench:     368,562 ns/iter (+/- 5,517)
```
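The "stop on equality" is just the strictness of the comparison in the sift-down loop. A sketch over plain integers (illustrative, not the merged code):

```rust
// Sketch: the descent uses strict `<`, so equal heads stop the sift
// immediately and no swap is performed for ties.
fn sift_down_strict(v: &mut [i32], mut pos: usize) {
    loop {
        let left = 2 * pos + 1;
        if left >= v.len() {
            return;
        }
        let right = left + 1;
        // pick the smaller child
        let child = if right < v.len() && v[right] < v[left] { right } else { left };
        if v[child] < v[pos] {
            // strict `<`: equal elements terminate the descent here
            v.swap(pos, child);
            pos = child;
        } else {
            return;
        }
    }
}
```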
@bluss (Member, Author) commented Feb 23, 2016

It's hard to say for sure. Inlining by itself shouldn't give anything; the main thing is the change to not move the iterator unless we reorder the heap, and the hand-inlining is part of that.

@pczarn commented Feb 23, 2016

I had an idea for a struct similar to HeadTail which I called PrePeeked. Perhaps that's a better name.

With specializable associated types, we could avoid storing head for slice::Iter and access iter.as_slice()[0] instead.
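For slice iterators the head is indeed already reachable without buffering it: specializable associated types remain hypothetical, but slice::Iter::as_slice is real. A sketch of the observation (peek_head is an illustrative name):

```rust
// slice::Iter exposes its unconsumed elements via as_slice(), so the
// current minimum could be read in place rather than stored in a
// HeadTail-style struct.
fn peek_head<'a, T>(it: &std::slice::Iter<'a, T>) -> Option<&'a T> {
    it.as_slice().first()
}
```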

Review comment on:

```
return None;
}
let result = if let Some(next_0) = self.heap[0].tail.next() {
    replace(&mut self.heap[0].head, next_0)
```

This should go to HeadTail::next, ideally

@bluss (Member, Author) replied: That's nice, done.

@bluss (Member, Author) commented Feb 23, 2016

@pczarn Hm, I can like this name and you can like yours 😄. It's an internal detail, so it doesn't matter much.

bluss added a commit that referenced this pull request Feb 23, 2016
@bluss merged commit 51cc406 into master on Feb 23, 2016
@bluss (Member, Author) commented Feb 23, 2016

Thanks for the input on this!
