
Commit a7719a7

Expand the testing guide to cover optimizations, benchmarks and how to
be more precise about what's being benchmarked. Also, reorganise the
layout a bit to put examples directly in their sections.
Parent: 3844734

1 file changed: +130 −56 lines changed

src/doc/guide-testing.md

@@ -16,10 +16,12 @@ fn return_two_test() {
 }
 ~~~
 
-To run these tests, use `rustc --test`:
+To run these tests, compile with `rustc --test` and run the resulting
+binary:
 
 ~~~ {.notrust}
-$ rustc --test foo.rs; ./foo
+$ rustc --test foo.rs
+$ ./foo
 running 1 test
 test return_two_test ... ok
 
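For reference, the diff never shows the `foo.rs` being compiled in this
hunk. Reconstructed from the hunk's context line (`fn return_two_test()`),
it would look roughly like the following sketch; the `return_two` helper
is an assumption:

~~~
fn return_two() -> int { 2 }

#[test]
fn return_two_test() {
    // Passes when the assertion holds and the function returns normally.
    assert!(return_two() == 2);
}
~~~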
@@ -47,8 +49,8 @@ value. To run the tests in a crate, it must be compiled with the
 `--test` flag: `rustc myprogram.rs --test -o myprogram-tests`. Running
 the resulting executable will run all the tests in the crate. A test
 is considered successful if its function returns; if the task running
-the test fails, through a call to `fail!`, a failed `check` or
-`assert`, or some other (`assert_eq`, ...) means, then the test fails.
+the test fails, through a call to `fail!`, a failed `assert`, or some
+other (`assert_eq`, ...) means, then the test fails.
 
 When compiling a crate with the `--test` flag `--cfg test` is also
 implied, so that tests can be conditionally compiled.
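To make the last two paragraphs of this hunk concrete, here is a minimal
sketch of a conditionally compiled test module (era-appropriate Rust; the
module and test names are hypothetical):

~~~
// Compiled only when `--cfg test` is implied by `rustc --test`.
#[cfg(test)]
mod tests {
    #[test]
    fn passes() {
        // Returning normally marks the test as ok.
        assert_eq!(1 + 1, 2);
    }

    #[test]
    fn fails() {
        // Failing the test's task marks the test as FAILED.
        fail!("deliberate failure");
    }
}
~~~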
@@ -100,7 +102,63 @@ failure output difficult. In these cases you can set the
 `RUST_TEST_TASKS` environment variable to 1 to make the tests run
 sequentially.
 
-## Benchmarking
+## Examples
+
+### Typical test run
+
+~~~ {.notrust}
+$ mytests
+
+running 30 tests
+running driver::tests::mytest1 ... ok
+running driver::tests::mytest2 ... ignored
+... snip ...
+running driver::tests::mytest30 ... ok
+
+result: ok. 28 passed; 0 failed; 2 ignored
+~~~
+
+### Test run with failures
+
+~~~ {.notrust}
+$ mytests
+
+running 30 tests
+running driver::tests::mytest1 ... ok
+running driver::tests::mytest2 ... ignored
+... snip ...
+running driver::tests::mytest30 ... FAILED
+
+result: FAILED. 27 passed; 1 failed; 2 ignored
+~~~
+
+### Running ignored tests
+
+~~~ {.notrust}
+$ mytests --ignored
+
+running 2 tests
+running driver::tests::mytest2 ... failed
+running driver::tests::mytest10 ... ok
+
+result: FAILED. 1 passed; 1 failed; 0 ignored
+~~~
+
+### Running a subset of tests
+
+~~~ {.notrust}
+$ mytests mytest1
+
+running 11 tests
+running driver::tests::mytest1 ... ok
+running driver::tests::mytest10 ... ignored
+... snip ...
+running driver::tests::mytest19 ... ok
+
+result: ok. 11 passed; 0 failed; 1 ignored
+~~~
+
+# Microbenchmarking
 
 The test runner also understands a simple form of benchmark execution.
 Benchmark functions are marked with the `#[bench]` attribute, rather
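The `RUST_TEST_TASKS` environment variable mentioned at the top of this
hunk would be set like so (shell sketch; the `mytests` binary name follows
the examples above):

~~~ {.notrust}
$ RUST_TEST_TASKS=1 ./mytests
~~~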
@@ -111,11 +169,12 @@ component of your testsuite, pass `--bench` to the compiled test
 runner.
 
 The type signature of a benchmark function differs from a unit test:
-it takes a mutable reference to type `test::BenchHarness`. Inside the
-benchmark function, any time-variable or "setup" code should execute
-first, followed by a call to `iter` on the benchmark harness, passing
-a closure that contains the portion of the benchmark you wish to
-actually measure the per-iteration speed of.
+it takes a mutable reference to type
+`extra::test::BenchHarness`. Inside the benchmark function, any
+time-variable or "setup" code should execute first, followed by a call
+to `iter` on the benchmark harness, passing a closure that contains
+the portion of the benchmark you wish to actually measure the
+per-iteration speed of.
 
 For benchmarks relating to processing/generating data, one can set the
 `bytes` field to the number of bytes consumed/produced in each
@@ -128,15 +187,16 @@ For example:
 ~~~
 extern mod extra;
 use std::vec;
+use extra::test::BenchHarness;
 
 #[bench]
-fn bench_sum_1024_ints(b: &mut extra::test::BenchHarness) {
+fn bench_sum_1024_ints(b: &mut BenchHarness) {
     let v = vec::from_fn(1024, |n| n);
     b.iter(|| {v.iter().fold(0, |old, new| old + *new);} );
 }
 
 #[bench]
-fn initialise_a_vector(b: &mut extra::test::BenchHarness) {
+fn initialise_a_vector(b: &mut BenchHarness) {
     b.iter(|| {vec::from_elem(1024, 0u64);} );
     b.bytes = 1024 * 8;
 }
@@ -163,74 +223,88 @@ Advice on writing benchmarks:
 To run benchmarks, pass the `--bench` flag to the compiled
 test-runner. Benchmarks are compiled-in but not executed by default.
 
-## Examples
-
-### Typical test run
-
 ~~~ {.notrust}
-> mytests
+$ rustc mytests.rs -O --test
+$ mytests --bench
 
-running 30 tests
-running driver::tests::mytest1 ... ok
-running driver::tests::mytest2 ... ignored
-... snip ...
-running driver::tests::mytest30 ... ok
+running 2 tests
+test bench_sum_1024_ints ... bench: 709 ns/iter (+/- 82)
+test initialise_a_vector ... bench: 424 ns/iter (+/- 99) = 19320 MB/s
 
-result: ok. 28 passed; 0 failed; 2 ignored
-~~~ {.notrust}
+test result: ok. 0 passed; 0 failed; 0 ignored; 2 measured
+~~~

182-
### Test run with failures
237+
## Benchmarks and the optimizer
183238

184-
~~~ {.notrust}
185-
> mytests
239+
Benchmarks compiled with optimizations activated can be dramatically
240+
changed by the optimizer so that the benchmark is no longer
241+
benchmarking what one expects. For example, the compiler might
242+
recognize that some calculation has no external effects and remove
243+
it entirely.
186244

187-
running 30 tests
188-
running driver::tests::mytest1 ... ok
189-
running driver::tests::mytest2 ... ignored
190-
... snip ...
191-
running driver::tests::mytest30 ... FAILED
245+
~~~
246+
extern mod extra;
247+
use extra::test::BenchHarness;
192248
193-
result: FAILED. 27 passed; 1 failed; 2 ignored
249+
#[bench]
250+
fn bench_xor_1000_ints(bh: &mut BenchHarness) {
251+
bh.iter(|| {
252+
range(0, 1000).fold(0, |old, new| old ^ new);
253+
});
254+
}
194255
~~~
195256

196-
### Running ignored tests
257+
gives the following results
197258

198259
~~~ {.notrust}
199-
> mytests --ignored
200-
201-
running 2 tests
202-
running driver::tests::mytest2 ... failed
203-
running driver::tests::mytest10 ... ok
260+
running 1 test
261+
test bench_xor_1000_ints ... bench: 0 ns/iter (+/- 0)
204262
205-
result: FAILED. 1 passed; 1 failed; 0 ignored
263+
test result: ok. 0 passed; 0 failed; 0 ignored; 1 measured
206264
~~~
207265

208-
### Running a subset of tests
266+
The benchmarking runner offers two ways to avoid this. Either, the
267+
closure that the `iter` method receives can return an arbitrary value
268+
which forces the optimizer to consider the result used and ensures it
269+
cannot remove the computation entirely. This could be done for the
270+
example above by adjusting the `bh.iter` call to
209271

210-
~~~ {.notrust}
211-
> mytests mytest1
272+
~~~
273+
bh.iter(|| range(0, 1000).fold(0, |old, new| old ^ new))
274+
~~~
212275

213-
running 11 tests
214-
running driver::tests::mytest1 ... ok
215-
running driver::tests::mytest10 ... ignored
216-
... snip ...
217-
running driver::tests::mytest19 ... ok
276+
Or, the other option is to call the generic `extra::test::black_box`
277+
function, which is an opaque "black box" to the optimizer and so
278+
forces it to consider any argument as used.
218279

219-
result: ok. 11 passed; 0 failed; 1 ignored
220280
~~~
281+
use extra::test::black_box
221282
222-
### Running benchmarks
283+
bh.iter(|| {
284+
black_box(range(0, 1000).fold(0, |old, new| old ^ new));
285+
});
286+
~~~
223287

224-
~~~ {.notrust}
225-
> mytests --bench
288+
Neither of these read or modify the value, and are very cheap for
289+
small values. Larger values can be passed indirectly to reduce
290+
overhead (e.g. `black_box(&huge_struct)`).
226291

227-
running 2 tests
228-
test bench_sum_1024_ints ... bench: 709 ns/iter (+/- 82)
229-
test initialise_a_vector ... bench: 424 ns/iter (+/- 99) = 19320 MB/s
292+
Performing either of the above changes gives the following
293+
benchmarking results
230294

231-
test result: ok. 0 passed; 0 failed; 0 ignored; 2 measured
295+
~~~ {.notrust}
296+
running 1 test
297+
test bench_xor_1000_ints ... bench: 375 ns/iter (+/- 148)
298+
299+
test result: ok. 0 passed; 0 failed; 0 ignored; 1 measured
232300
~~~
233301

+However, the optimizer can still modify a testcase in an undesirable
+manner even when using either of the above. Benchmarks can be checked
+by hand by looking at the output of the compiler, using `--emit=ir`
+(for LLVM IR) or `--emit=asm` (for assembly), or by compiling normally
+and using any method for examining object code.
+
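Such a by-hand check might look like the following sketch (the
`--emit=asm` flag is the one named above; the `.s` output file name and
the `grep` step are assumptions):

~~~ {.notrust}
$ rustc mytests.rs -O --test --emit=asm
$ grep -A 20 bench_xor_1000_ints mytests.s
~~~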
 
 ## Saving and ratcheting metrics
 
 When running benchmarks or other tests, the test runner can record
