Add asv benchmark jobs to CI #5796
Merged
123 commits
ff0b563
Create benchmark-cron.yml
Illviljan bfb317c
move config
Illviljan 6a3a91e
Create benchmarks.yml
Illviljan 718ab63
Update asv.conf.json
Illviljan 02daf51
Update asv.conf.json
Illviljan b5aebef
Create benchmarks-label-triggered.yml
Illviljan 156a6f8
newlines at end
Illviljan 448d92c
Update benchmark_cron.yml
Illviljan 03c5c63
Delete benchmark_cron.yml
Illviljan 2465696
Update benchmark-cron.yml
Illviljan 13a1cee
rename
Illviljan 9f79f15
Create benchmarks-no-label-triggered.yml
Illviljan 5630aa2
Create README_CI.md
Illviljan 30c801e
Update benchmarks-no-label-triggered.yml
Illviljan 069acaa
Remove some test workflows.
Illviljan 3b15f98
Update benchmarks-no-label-triggered.yml
Illviljan 19727fb
Update benchmarks-no-label-triggered.yml
Illviljan 8e9db9e
Update asv.conf.json
Illviljan 7fdb4bd
Update asv.conf.json
Illviljan 07b3b39
Remove label triggered workflow
Illviljan eac0236
Update benchmarks.yml
Illviljan 459dc04
Add _skip_slow
Illviljan fab96b5
Update __init__.py
Illviljan 1cd72dd
Update __init__.py
Illviljan 15e03d0
global asv directory
Illviljan 7b85bea
Update benchmarks.yml
Illviljan eef9311
Update benchmarks.yml
Illviljan 0c9ddda
Update benchmarks.yml
Illviljan b72c984
Update benchmarks.yml
Illviljan 888e275
Update benchmarks.yml
Illviljan 4abd9c4
Update benchmarks.yml
Illviljan 5f2bfec
Update benchmarks.yml
Illviljan 1a04140
Update benchmarks.yml
Illviljan 69316c4
Update repr.py
Illviljan 61287e2
Update benchmarks.yml
Illviljan e361c5c
Update benchmarks.yml
Illviljan 49e33fc
Update benchmarks.yml
Illviljan 4c1adea
Create pandas.py
Illviljan 9826976
Update pandas.py
Illviljan f2242d3
Update pandas.py
Illviljan 64dd4ea
Create interp.py
Illviljan 607916b
Update interp.py
Illviljan e4a2011
Update interp.py
Illviljan 115f02c
Update interp.py
Illviljan 468489b
check combine
Illviljan 72e195f
Update combine.py
Illviljan 15438bd
test missing
Illviljan f71630b
Update dataarray_missing.py
Illviljan 46caf1d
Update dataarray_missing.py
Illviljan 17b28f7
Update dataarray_missing.py
Illviljan ecde362
Update dataarray_missing.py
Illviljan 94bc5ed
Update dataarray_missing.py
Illviljan 66cb6e8
Update dataarray_missing.py
Illviljan 2c481c3
Update dataarray_missing.py
Illviljan 4db5d7e
Update dataarray_missing.py
Illviljan b9a8462
Update dataarray_missing.py
Illviljan 1efb890
test unstacking
Illviljan 603bec5
Update unstacking.py
Illviljan e598c29
test reindexing
Illviljan 8fb88dc
Update reindexing.py
Illviljan 19041ec
Update reindexing.py
Illviljan 2bf57a2
test rolling
Illviljan 46c1c43
Update rolling.py
Illviljan 2a694ca
Update rolling.py
Illviljan 278c858
Update rolling.py
Illviljan f1d9bef
Update rolling.py
Illviljan 7b80778
add the tests back
Illviljan 5e66930
Update dataarray_missing.py
Illviljan fa5855e
Merge branch 'pydata:main' into asv-benchmark-cron
Illviljan 6553d58
Update rolling.py
Illviljan c068f8c
Update rolling.py
Illviljan 5a846cc
nanmean gets divide by zero errors
Illviljan 2745cbf
skip dataset_io
Illviljan 5e639aa
Update dataset_io.py
Illviljan cffc06a
Update dataset_io.py
Illviljan 2a4d32f
Update indexing.py
Illviljan 7776a82
Update indexing.py
Illviljan 1aaaa71
use parametrized
Illviljan 0a378b5
Update indexing.py
Illviljan 2e076bb
Update indexing.py
Illviljan 89e203e
Update indexing.py
Illviljan fc6a2b5
Update indexing.py
Illviljan c1d85f1
Update whats-new.rst
Illviljan 0278acc
Merge branch 'pydata:main' into asv-benchmark-cron
Illviljan b9496ae
Update dataset_io.py
Illviljan 21f7a91
Update dataset_io.py
Illviljan d1aeeda
Run benchmarks with labels only, comment out ccache
Illviljan aeab4ca
Merge branch 'main' into asv-benchmark-cron
Illviljan b7b3070
Update benchmarks.yml
Illviljan dafbea6
Update benchmarks.yml
Illviljan 206ac72
Update benchmarks.yml
Illviljan 6b5c688
Update benchmarks.yml
Illviljan aad08ac
Revert "Merge branch 'main' into asv-benchmark-cron"
Illviljan 7815882
Revert "Update benchmarks.yml"
Illviljan 087ff76
Revert "Revert "Update benchmarks.yml""
Illviljan 2b7858c
Update benchmarks.yml
Illviljan 53cd0a7
test triggering with a label
Illviljan 376e9e5
Update benchmarks.yml
Illviljan de63ee0
Update benchmarks.yml
Illviljan 771b9b5
Merge branch 'main' into asv-benchmark-cron
Illviljan eac2545
Revert "Revert "Merge branch 'main' into asv-benchmark-cron""
Illviljan a66eef5
Update benchmarks.yml
Illviljan 884b24a
Update benchmarks.yml
Illviljan 743ba62
Update benchmarks.yml
Illviljan 4d7ae6d
Update benchmarks.yml
Illviljan e3de111
Update benchmarks.yml
Illviljan 431cfe6
Update benchmarks.yml
Illviljan 6e73f9b
Update benchmarks.yml
Illviljan 24ca03e
Update benchmarks.yml
Illviljan 5c36e26
remove ccache
Illviljan 919d794
Update benchmarks.yml
Illviljan 91edc08
Try something else than mamba
Illviljan 318e99e
Update benchmarks.yml
Illviljan 6a2b855
Update benchmarks.yml
Illviljan 321a761
test missing again
Illviljan 1eba65c
Update dataarray_missing.py
Illviljan 56556f1
Update dataarray_missing.py
Illviljan 8f08506
Update dataarray_missing.py
Illviljan d1b908a
Update dataarray_missing.py
Illviljan 8f262f9
Update dataarray_missing.py
Illviljan 0b7b1a0
Update dataarray_missing.py
Illviljan 712a453
Update dataarray_missing.py
Illviljan 70cd679
add back tests
Illviljan
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,77 @@ | ||
name: Benchmark | ||
|
||
on: | ||
pull_request: | ||
types: [opened, reopened, synchronize, labeled] | ||
workflow_dispatch: | ||
|
||
jobs: | ||
benchmark: | ||
if: | | ||
${{ contains( github.event.pull_request.labels.*.name, 'run-benchmark') | ||
&& github.event_name == 'pull_request' | ||
|| github.event_name == 'workflow_dispatch' }} | ||
name: Linux | ||
runs-on: ubuntu-20.04 | ||
env: | ||
ASV_DIR: "./asv_bench" | ||
|
||
steps: | ||
# We need the full repo to avoid this issue | ||
# https://github.com/actions/checkout/issues/23 | ||
- uses: actions/checkout@v2 | ||
with: | ||
fetch-depth: 0 | ||
|
||
- name: Setup Miniconda | ||
uses: conda-incubator/setup-miniconda@v2 | ||
with: | ||
# installer-url: https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh | ||
installer-url: https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh | ||
|
||
- name: Setup some dependencies | ||
shell: bash -l {0} | ||
run: | | ||
pip install asv | ||
sudo apt-get update -y | ||
|
||
- name: Run benchmarks | ||
shell: bash -l {0} | ||
id: benchmark | ||
env: | ||
OPENBLAS_NUM_THREADS: 1 | ||
MKL_NUM_THREADS: 1 | ||
OMP_NUM_THREADS: 1 | ||
ASV_FACTOR: 1.5 | ||
ASV_SKIP_SLOW: 1 | ||
run: | | ||
set -x | ||
# ID this runner | ||
asv machine --yes | ||
echo "Baseline: ${{ github.event.pull_request.base.sha }} (${{ github.event.pull_request.base.label }})" | ||
echo "Contender: ${GITHUB_SHA} (${{ github.event.pull_request.head.label }})" | ||
# Use mamba for env creation | ||
# export CONDA_EXE=$(which mamba) | ||
export CONDA_EXE=$(which conda) | ||
# Run benchmarks for current commit against base | ||
ASV_OPTIONS="--split --show-stderr --factor $ASV_FACTOR" | ||
asv continuous $ASV_OPTIONS ${{ github.event.pull_request.base.sha }} ${GITHUB_SHA} \ | ||
| sed "/Traceback \|failed$\|PERFORMANCE DECREASED/ s/^/::error::/" \ | ||
| tee benchmarks.log | ||
# Report and export results for subsequent steps | ||
if grep "Traceback \|failed\|PERFORMANCE DECREASED" benchmarks.log > /dev/null ; then | ||
exit 1 | ||
fi | ||
working-directory: ${{ env.ASV_DIR }} | ||
|
||
- name: Add instructions to artifact | ||
if: always() | ||
run: | | ||
cp benchmarks/README_CI.md benchmarks.log .asv/results/ | ||
working-directory: ${{ env.ASV_DIR }} | ||
|
||
- uses: actions/upload-artifact@v2 | ||
if: always() | ||
with: | ||
name: asv-benchmark-results-${{ runner.os }} | ||
path: ${{ env.ASV_DIR }}/.asv/results |
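The failure detection in the `Run benchmarks` step can be easier to follow outside of shell quoting. The following is a minimal Python sketch of the same idea, not part of the workflow itself: the function names `annotate` and `find_failures` are hypothetical, and the two patterns are copied from the `sed` and `grep` calls above (note that `sed` only matches `failed` at the end of a line, while `grep` matches it anywhere).

```python
import re

# Pattern used by the sed call: lines matching it get an "::error::" prefix.
ANNOTATE_PATTERN = re.compile(r"Traceback |failed$|PERFORMANCE DECREASED")
# Pattern used by the grep call: any match makes the step exit with status 1.
FAILURE_PATTERN = re.compile(r"Traceback |failed|PERFORMANCE DECREASED")


def annotate(log_text):
    """Prefix flagged lines with '::error::' so GitHub Actions highlights them."""
    lines = []
    for line in log_text.splitlines():
        if ANNOTATE_PATTERN.search(line):
            line = "::error::" + line
        lines.append(line)
    return "\n".join(lines)


def find_failures(log_text):
    """Return the log lines that would cause the step to fail."""
    return [line for line in log_text.splitlines() if FAILURE_PATTERN.search(line)]
```

A caller would exit with a non-zero status whenever `find_failures` returns a non-empty list, which mirrors the `exit 1` branch in the step.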
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,122 @@ | ||
# Benchmark CI | ||
|
||
<!-- Author: @jaimergp --> | ||
<!-- Last updated: 2021.07.06 --> | ||
<!-- Describes the work done as part of https://github.com/scikit-image/scikit-image/pull/5424 --> | ||
|
||
## How it works | ||
|
||
The `asv` suite can be run for any PR on GitHub Actions (check workflow `.github/workflows/benchmarks.yml`) by adding a `run-benchmark` label to said PR. This will trigger a job that will run the benchmarking suite for the current PR head (merged commit) against the PR base (usually `main`). | ||
|
||
We use `asv continuous` to run the job, which runs a relative performance measurement. This means that there's no state to be saved and that regressions are only caught in terms of performance ratio (absolute numbers are available but they are not useful since we do not use stable hardware over time). `asv continuous` will: | ||
|
||
* Build the project for _both_ commits, with `conda` creating the build environments. | ||
* Run the benchmark suite for both commits, _twice_ (since `processes=2` by default). | ||
* Generate a report table with performance ratios: | ||
* `ratio=1.0` -> performance didn't change. | ||
* `ratio<1.0` -> PR made it faster. | ||
* `ratio>1.0` -> PR made it slower. | ||
|
||
Due to the sensitivity of the test, we cannot guarantee that false positives are not produced. In practice, values between `(0.7, 1.5)` are to be considered part of the measurement noise. When in doubt, running the benchmark suite one more time will provide more information about the test being a false positive or not. | ||
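As a rough illustration of how a ratio relates to the `--factor` threshold, here is a minimal sketch. It takes the ratio as `after / before`, consistent with the comparison table later in this README (1.23 ms before, 1.37 ms after, ratio above 1). The function name `classify` and the symmetric `1 / factor` bound for improvements are assumptions made for this example; `asv` additionally applies statistical checks that are not modelled here.

```python
def classify(before, after, factor=1.5):
    """Classify a timing change the way the split report groups benchmarks.

    `before` and `after` are timings in the same unit for the baseline
    and the contender. The thresholds are an assumption of this sketch.
    """
    ratio = after / before
    if ratio > factor:
        return "worsened"
    if ratio < 1 / factor:
        return "improved"
    return "unchanged"


# Timings taken from the comparison table in this README.
print(classify(1.23, 1.37))
```

With the default factor of 1.5, the 1.23 ms to 1.37 ms change falls inside the noise band and is reported as unchanged.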
|
||
## Running the benchmarks on GitHub Actions | ||
|
||
1. On a PR, add the label `run-benchmark`. | ||
2. The CI job will be started. Checks will appear in the usual dashboard panel above the comment box. | ||
3. If more commits are added, the label checks will be grouped with the last commit checks _before_ you added the label. | ||
4. Alternatively, you can always go to the `Actions` tab in the repo and [filter for `workflow:Benchmark`](https://github.com/scikit-image/scikit-image/actions?query=workflow%3ABenchmark). Your username will be assigned to the `actor` field, so you can also filter the results with that if you need it. | ||
|
||
## The artifacts | ||
|
||
The CI job will also generate an artifact. This is the `.asv/results` directory compressed in a zip file. Its contents include: | ||
|
||
* `fv-xxxxx-xx/`. A directory for the machine that ran the suite. It contains three files: | ||
* `<baseline>.json`, `<contender>.json`: the benchmark results for each commit, with stats. | ||
* `machine.json`: details about the hardware. | ||
* `benchmarks.json`: metadata about the current benchmark suite. | ||
* `benchmarks.log`: the CI logs for this run. | ||
* This README. | ||
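To get a quick overview of an unzipped artifact, something like the following sketch could be used. It relies only on the layout described above (one directory per machine containing `machine.json` plus one JSON file per commit); the function name `summarize_results` is hypothetical, and the internal structure of the JSON files is deliberately not interpreted.

```python
import json
from pathlib import Path


def summarize_results(results_dir):
    """Map each machine directory to its machine info and result file names."""
    summary = {}
    for machine_dir in sorted(p for p in Path(results_dir).iterdir() if p.is_dir()):
        machine_file = machine_dir / "machine.json"
        # machine.json holds the hardware details; load it if present.
        machine = json.loads(machine_file.read_text()) if machine_file.exists() else {}
        # Every other JSON file is assumed to be a per-commit result file.
        result_files = sorted(
            p.name for p in machine_dir.glob("*.json") if p.name != "machine.json"
        )
        summary[machine_dir.name] = {"machine": machine, "result_files": result_files}
    return summary
```

Calling `summarize_results(".asv/results")` after extracting the artifact gives the machine directory name needed for the `-m` option used later in this document.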
|
||
## Re-running the analysis | ||
|
||
Although the CI logs should be enough to get an idea of what happened (check the table at the end), one can use `asv` to run the analysis routines again. | ||
|
||
1. Uncompress the artifact contents in the repo, under `.asv/results`. That is, you should see `.asv/results/benchmarks.log`, not `.asv/results/something_else/benchmarks.log`. Write down the machine directory name for later. | ||
2. Run `asv show` to see your available results. You will see something like this: | ||
|
||
``` | ||
$> asv show | ||
|
||
Commits with results: | ||
|
||
Machine : Jaimes-MBP | ||
Environment: conda-py3.9-cython-numpy1.20-scipy | ||
|
||
00875e67 | ||
|
||
Machine : fv-az95-499 | ||
Environment: conda-py3.7-cython-numpy1.17-pooch-scipy | ||
|
||
8db28f02 | ||
3a305096 | ||
``` | ||
|
||
3. We are interested in the commits for `fv-az95-499` (the CI machine for this run). We can compare them with `asv compare` and some extra options. `--sort ratio` will show largest ratios first, instead of alphabetical order. `--split` will produce three tables: improved, worsened, no changes. `--factor 1.5` tells `asv` to only complain if deviations are above a 1.5 ratio. `-m` is used to indicate the machine ID (use the one you wrote down in step 1). Finally, specify your commit hashes: baseline first, then contender! | ||
|
||
``` | ||
$> asv compare --sort ratio --split --factor 1.5 -m fv-az95-499 8db28f02 3a305096 | ||
|
||
Benchmarks that have stayed the same: | ||
|
||
before after ratio | ||
[8db28f02] [3a305096] | ||
<ci-benchmark-check~9^2> | ||
n/a n/a n/a benchmark_restoration.RollingBall.time_rollingball_ndim | ||
1.23±0.04ms 1.37±0.1ms 1.12 benchmark_transform_warp.WarpSuite.time_to_float64(<class 'numpy.float64'>, 128, 3) | ||
5.07±0.1μs 5.59±0.4μs 1.10 benchmark_transform_warp.ResizeLocalMeanSuite.time_resize_local_mean(<class 'numpy.float32'>, (192, 192, 192), (192, 192, 192)) | ||
1.23±0.02ms 1.33±0.1ms 1.08 benchmark_transform_warp.WarpSuite.time_same_type(<class 'numpy.float32'>, 128, 3) | ||
9.45±0.2ms 10.1±0.5ms 1.07 benchmark_rank.Rank3DSuite.time_3d_filters('majority', (32, 32, 32)) | ||
23.0±0.9ms 24.6±1ms 1.07 benchmark_interpolation.InterpolationResize.time_resize((80, 80, 80), 0, 'symmetric', <class 'numpy.float64'>, True) | ||
38.7±1ms 41.1±1ms 1.06 benchmark_transform_warp.ResizeLocalMeanSuite.time_resize_local_mean(<class 'numpy.float32'>, (2048, 2048), (192, 192, 192)) | ||
4.97±0.2μs 5.24±0.2μs 1.05 benchmark_transform_warp.ResizeLocalMeanSuite.time_resize_local_mean(<class 'numpy.float32'>, (2048, 2048), (2048, 2048)) | ||
4.21±0.2ms 4.42±0.3ms 1.05 benchmark_rank.Rank3DSuite.time_3d_filters('gradient', (32, 32, 32)) | ||
|
||
... | ||
``` | ||
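When the comparison has to be repeated for several artifacts, the command line can be assembled programmatically. The sketch below only builds the argument list from the options described in step 3; the helper name `build_compare_command` is hypothetical, and actually running the command (for example with `subprocess.run`) is left to the caller.

```python
import shlex


def build_compare_command(machine, baseline, contender, factor=1.5):
    """Return the `asv compare` invocation as an argument list.

    The baseline hash comes first, then the contender.
    """
    return [
        "asv", "compare",
        "--sort", "ratio",
        "--split",
        "--factor", str(factor),
        "-m", machine,
        baseline, contender,
    ]


# Machine ID and hashes taken from the example in this README.
print(shlex.join(build_compare_command("fv-az95-499", "8db28f02", "3a305096")))
```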
|
||
If you want more details on a specific test, you can use `asv show`. Use `-b pattern` to filter which tests to show, and then specify a commit hash to inspect: | ||
|
||
``` | ||
$> asv show -b time_to_float64 8db28f02 | ||
|
||
Commit: 8db28f02 <ci-benchmark-check~9^2> | ||
|
||
benchmark_transform_warp.WarpSuite.time_to_float64 [fv-az95-499/conda-py3.7-cython-numpy1.17-pooch-scipy] | ||
ok | ||
=============== ============= ========== ============= ========== ============ ========== ============ ========== ============ | ||
-- N / order | ||
--------------- -------------------------------------------------------------------------------------------------------------- | ||
dtype_in 128 / 0 128 / 1 128 / 3 1024 / 0 1024 / 1 1024 / 3 4096 / 0 4096 / 1 4096 / 3 | ||
=============== ============= ========== ============= ========== ============ ========== ============ ========== ============ | ||
numpy.uint8 2.56±0.09ms 523±30μs 1.28±0.05ms 130±3ms 28.7±2ms 81.9±3ms 2.42±0.01s 659±5ms 1.48±0.01s | ||
numpy.uint16 2.48±0.03ms 530±10μs 1.28±0.02ms 130±1ms 30.4±0.7ms 81.1±2ms 2.44±0s 653±3ms 1.47±0.02s | ||
numpy.float32 2.59±0.1ms 518±20μs 1.27±0.01ms 127±3ms 26.6±1ms 74.8±2ms 2.50±0.01s 546±10ms 1.33±0.02s | ||
numpy.float64 2.48±0.04ms 513±50μs 1.23±0.04ms 134±3ms 30.7±2ms 85.4±2ms 2.55±0.01s 632±4ms 1.45±0.01s | ||
=============== ============= ========== ============= ========== ============ ========== ============ ========== ============ | ||
started: 2021-07-06 06:14:36, duration: 1.99m | ||
``` | ||
|
||
## Other details | ||
|
||
### Skipping slow or demanding tests | ||
|
||
To minimize the time required to run the full suite, we trimmed the parameter matrix in some cases and, in others, directly skipped tests that ran for too long or required too much memory. Unlike `pytest`, `asv` does not have a notion of marks. However, you can `raise NotImplementedError` in the setup step to skip a test. In that vein, a new private function, `_skip_slow`, is defined in `benchmarks.__init__`. It checks the `ASV_SKIP_SLOW` environment variable and, if it is set to `1`, raises `NotImplementedError` so that the test is skipped. To implement this behavior in other tests, you can add the following attribute: | ||
|
||
```python | ||
from . import _skip_slow # this function is defined in benchmarks.__init__ | ||
|
||
def time_something_slow(): | ||
    pass | ||
|
||
time_something_slow.setup = _skip_slow | ||
``` |