CHANGE: Performance Optimizations for IteratorRandom::choose() and related methods #1266

@wainwrightmark

Summary

IteratorRandom::choose() and related methods call gen_range() much more frequently than strictly necessary when choosing from iterators without size hints.
By reducing the number of these calls, it is possible to speed up the relevant benchmarks by about 1.5-2x.

Specifically:

```
# Using Pcg32
test seq_iter_unhinted_choose_from_1000      ... bench:       4,049 ns/iter (+/- 377) # old
test seq_iter_unhinted_choose_from_1000      ... bench:       2,730 ns/iter (+/- 335) # new

# Using ChaChaRng
test seq_iter_unhinted_choose_from_1000      ... bench:       6,831 ns/iter (+/- 317) # old
test seq_iter_unhinted_choose_from_1000      ... bench:       3,447 ns/iter (+/- 627) # new
```

This would not require a breaking change to the API, nor would it violate any explicit promises. However, if this change is made, the choose() method will return different items for a given seed, and the Rng will be left in a different state afterwards, which may affect some users.

Details

The IteratorRandom::choose() and related methods would have to change.
The way they work now (for iterators without size hints) is that for every item, they generate a random number in the range 0..n, where n is the number of items seen so far. If the generated number is zero (a 1-in-n chance), the current result is replaced with the new item.
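
For reference, here is a minimal sketch of that existing strategy (reservoir sampling with a reservoir of one); the function and variable names are illustrative, not the actual rand internals:

```rust
use rand::Rng;

// Sketch of the current strategy: one gen_range call per item seen.
fn choose_unhinted<I, R>(mut iter: I, rng: &mut R) -> Option<I::Item>
where
    I: Iterator,
    R: Rng + ?Sized,
{
    let mut result = iter.next()?;
    let mut consumed: u64 = 1;
    for item in iter {
        consumed += 1;
        // A 1-in-`consumed` chance of replacing the current result keeps
        // every item equally likely to be the one returned.
        if rng.gen_range(0..consumed) == 0 {
            result = item;
        }
    }
    Some(result)
}
```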

This means that for the first twenty items, twenty random numbers are generated. My suggestion is to instead generate a single number in the range 0..20! (20 factorial is the largest factorial less than u64::MAX) and use it in place of those first twenty random numbers.

A number in that range has a 1-in-2 chance of being 0 mod 2; after dividing by 2, it has a 1-in-3 chance of being 0 mod 3, and so on up to 20.
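
To illustrate, here is a sketch of how one draw could be unpacked into those decisions; the constant and function names here are hypothetical, not proposed API:

```rust
use rand::Rng;

/// 20! is the largest factorial that fits in a u64.
const FACTORIAL_20: u64 = 2_432_902_008_176_640_000;

/// Decode one uniform draw from 0..20! into the 19 replacement decisions
/// for items 2..=20 (the first item is always taken; its modulus is 1).
fn decode_decisions<R: Rng + ?Sized>(rng: &mut R) -> [bool; 19] {
    let mut r = rng.gen_range(0..FACTORIAL_20);
    let mut replace = [false; 19];
    for n in 2..=20u64 {
        // `r % n == 0` happens with probability exactly 1/n, and the
        // quotient `r / n` stays uniform and independent of this decision,
        // because the current range is divisible by n.
        replace[(n - 2) as usize] = r % n == 0;
        r /= n;
    }
    replace
}
```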

After the first twenty items, the next draw has to come from the range 0..33!/20!, which only encodes 13 decisions, and the efficiency gradually decreases as n grows.
This seems like a lot of work, but it is still much cheaper than generating fresh random numbers, especially with cryptographically secure RNGs.
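
The batch sizes could be computed on the fly with something like the following (again a sketch under my own naming, not the proposed implementation):

```rust
/// Given that `consumed` items have been seen so far, return how many further
/// decisions a single u64 draw can encode, together with the range to draw
/// from (the product of the moduli consumed+1, consumed+2, ...).
fn next_batch(consumed: u64) -> (u64, u64) {
    let mut product: u64 = 1;
    let mut count = 0;
    let mut n = consumed + 1;
    // Grow the product until the next factor would overflow a u64.
    while let Some(p) = product.checked_mul(n) {
        product = p;
        count += 1;
        n += 1;
    }
    (count, product)
}

// next_batch(1)  == (19, 2_432_902_008_176_640_000) // 20!, covers items 2..=20
// next_batch(20) == (13, 3_569_119_343_741_952_000) // 33!/20!, covers items 21..=33
```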

There is no measurable performance change when using choose() with a fixed-size slice iterator (the code for that path is unchanged).
When using an iterator with a size hint but no fixed size, performance seems to have degraded very slightly; it would be possible to apply this optimization to that case as well.

I believe the results of these functions would be consistent between 32-bit and 64-bit architectures (I am generating u64 values, not usize). I don't have 32-bit hardware to test on, so I can't comment on the performance differences.

The following methods would also probably benefit from similar optimizations:

IteratorRandom::choose_stable()
IteratorRandom::choose_multiple_fill()
IteratorRandom::choose_multiple()
SliceRandom::shuffle()
SliceRandom::partial_shuffle()

Motivation

Performance.

Alternatives

The code could be left as it is, and performance-hungry users could use my crate, which has a fast version of the choose function (I made the crate just to have a random_max function but got carried away optimizing it).

If this is considered worth doing, I'm happy to submit a PR. I have written most of the code already for my own crate.

Have a lovely day, Mark
