Iterator::max with reference-type items cannot leverage SIMD instructions, resulting in low performance

When manipulating arrays of numbers, it is pretty common to need the min, max, sum, etc. of their elements. While discussing internals with fellow developers, someone pointed out that the C# max method leverages SIMD. Out of curiosity, I checked both C++ and Rust.

My findings are as follows:
- LLVM is able to auto-vectorize this kind of loop
- the C++ STL `max_element` function leverages that
- my custom implementation is able to leverage that
- the Rust `Iterator` functions (`max`, `min`) cannot

This last point is due to the fact that the implementation does not require the item type to implement the `Copy` trait, so it operates on references to the elements rather than on the element values themselves.

```Rust
let my_array = (0..ITEM_COUNT).collect::<Vec<_>>();

// This is slow
#[inline(never)]
pub fn stdlib_max<T: Ord + Copy>(a: &[T]) -> Option<T> {
    a.iter().max().copied()
}

// This is fast
#[inline(never)]
pub fn custom_max<T: Ord + Copy>(a: &[T]) -> Option<T> {
    let first = *a.first()?;
    Some(a.iter().fold(first, |x, y| std::cmp::max(x, *y)))
}
```
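As a sanity check that the two functions above are interchangeable for `Ord + Copy` element types, here is a minimal self-contained sketch. The functions are repeated so the block compiles on its own, and `same_max` is a hypothetical helper that is not part of the linked repository. It only compares results and says nothing about speed.

```rust
/// Standard-library approach: `a.iter()` yields `&T`, so `max` compares
/// references and the winning reference is copied out at the end.
pub fn stdlib_max<T: Ord + Copy>(a: &[T]) -> Option<T> {
    a.iter().max().copied()
}

/// Custom approach: the accumulator is a plain value, and each element is
/// dereferenced (copied) before being compared.
pub fn custom_max<T: Ord + Copy>(a: &[T]) -> Option<T> {
    let first = *a.first()?;
    Some(a.iter().fold(first, |x, y| std::cmp::max(x, *y)))
}

/// Hypothetical helper: true when both implementations agree on `a`.
pub fn same_max<T: Ord + Copy>(a: &[T]) -> bool {
    stdlib_max(a) == custom_max(a)
}
```

Both functions return `None` for an empty slice, and for types such as `i32`, where equal values are indistinguishable, they return the same maximum.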

Still, as an end user, I would have expected the idiomatic Rust way of doing this (with iterators) to be optimal, and it is not.

Here is a small repository with a sample and a benchmark demonstrating the issue:

https://github.com/jfaixo/rust-max-bench

For finding the max of a `[i32; 100_000]` array:

```bash
❯ rustc -vV
rustc 1.68.0-nightly (388538fc9 2023-01-05)
binary: rustc
commit-hash: 388538fc963e07a94e3fc3ac8948627fd2d28d29
commit-date: 2023-01-05
host: x86_64-unknown-linux-gnu
release: 1.68.0-nightly
LLVM version: 15.0.6

❯ cargo bench
    Finished bench [optimized] target(s) in 0.00s
     Running unittests src/lib.rs (target/release/deps/rust_max_bench-a5d988f9520f9dde)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running benches/bench.rs (target/release/deps/bench-cf556ddbd1b864fb)

running 3 tests
test custom    ... bench:       8,052 ns/iter (+/- 385)
test itertools ... bench:      94,027 ns/iter (+/- 816)
test stdlib    ... bench:      94,477 ns/iter (+/- 1,545)

test result: ok. 0 passed; 0 failed; 0 ignored; 3 measured; 0 filtered out; finished in 2.40s
```
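The figures above come from `cargo bench` on nightly. For a rough comparison on a stable toolchain, a timing loop along the following lines may be enough. This is only a sketch under assumptions: `avg_ns` and `compare` are hypothetical helpers, not code from the linked repository, and wall-clock timing with `std::time::Instant` is much noisier than a real benchmark harness. It needs to be built with optimizations to be meaningful.

```rust
use std::hint::black_box;
use std::time::Instant;

fn stdlib_max(a: &[i32]) -> Option<i32> {
    a.iter().max().copied()
}

fn custom_max(a: &[i32]) -> Option<i32> {
    let first = *a.first()?;
    Some(a.iter().fold(first, |x, y| std::cmp::max(x, *y)))
}

/// Hypothetical helper: average wall-clock nanoseconds per call of `f`,
/// measured over `iters` calls. `black_box` keeps the result "used" so the
/// optimizer is less likely to discard the computation.
fn avg_ns<R>(iters: u32, mut f: impl FnMut() -> R) -> u128 {
    let start = Instant::now();
    for _ in 0..iters {
        black_box(f());
    }
    // `max(1)` avoids a division by zero when `iters` is 0.
    start.elapsed().as_nanos() / u128::from(iters.max(1))
}

/// Hypothetical helper: times both implementations on `0..item_count` and
/// returns `(stdlib_ns, custom_ns)`.
fn compare(item_count: i32, iters: u32) -> (u128, u128) {
    let data: Vec<i32> = (0..item_count).collect();
    let stdlib_ns = avg_ns(iters, || stdlib_max(black_box(data.as_slice())));
    let custom_ns = avg_ns(iters, || custom_max(black_box(data.as_slice())));
    (stdlib_ns, custom_ns)
}
```

Calling `compare(100_000, 1_000)` from a release build mirrors the array size used above; the absolute numbers will differ from the `cargo bench` output.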



Iterator::max with reference-type items cannot leverage SIMD instructions, resulting in low performance #106539
