
Commit a676c2b

Update for release (#335)
* Create release workflow and CITATION.cff and update README, setup.py
* fix bug in pypy token
* fix documentation formatting
* TODO for docker image
* accept suggestions from shuhei
* add further options for disable_file_output documentation
* remove from release.yml
1 parent 1e06cce commit a676c2b

File tree

8 files changed: +157 −11 lines changed

.github/workflows/release.yml

Lines changed: 33 additions & 0 deletions

```yaml
name: Push to PyPi

on:
  push:
    branches:
      - master

jobs:
  test:
    runs-on: "ubuntu-latest"

    steps:
    - name: Checkout source
      uses: actions/checkout@v2

    - name: Set up Python 3.8
      uses: actions/setup-python@v1
      with:
        python-version: 3.8

    - name: Install build dependencies
      run: python -m pip install build wheel

    - name: Build distributions
      shell: bash -l {0}
      run: python setup.py sdist bdist_wheel

    - name: Publish package to PyPI
      if: github.repository == 'automl/Auto-PyTorch' && github.event_name == 'push' && startsWith(github.ref, 'refs/tags')
      uses: pypa/gh-action-pypi-publish@master
      with:
        user: __token__
        password: ${{ secrets.pypi_token }}
```
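The `if:` guard on the publish step gates uploads on repository, event, and ref. A minimal Python sketch of that condition (the `should_publish` function is illustrative, not part of the workflow):

```python
def should_publish(repository: str, event_name: str, ref: str) -> bool:
    """Mirror the workflow's `if:` expression: publish only when a tag
    is pushed to the automl/Auto-PyTorch repository itself."""
    return (
        repository == "automl/Auto-PyTorch"
        and event_name == "push"
        and ref.startswith("refs/tags")
    )

# A tag push publishes; a plain branch push or a fork does not.
print(should_publish("automl/Auto-PyTorch", "push", "refs/tags/v0.1.0"))   # True
print(should_publish("automl/Auto-PyTorch", "push", "refs/heads/master"))  # False
```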

CITATION.cff

Lines changed: 19 additions & 0 deletions

```yaml
preferred-citation:
  type: article
  authors:
    - family-names: "Zimmer"
      given-names: "Lucas"
      affiliation: "University of Freiburg, Germany"
    - family-names: "Lindauer"
      given-names: "Marius"
      affiliation: "University of Freiburg, Germany"
    - family-names: "Hutter"
      given-names: "Frank"
      affiliation: "University of Freiburg, Germany"
  doi: "10.1109/TPAMI.2021.3067763"
  journal-title: "IEEE Transactions on Pattern Analysis and Machine Intelligence"
  title: "Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL"
  year: 2021
  note: "also available under https://arxiv.org/abs/2006.13799"
  start: 3079
  end: 3090
```
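For illustration, the fields above can be joined into a one-line reference. This sketch hard-codes the metadata as a plain dict rather than parsing the CFF file; `format_reference` is a hypothetical helper, not part of any tooling:

```python
# Metadata hand-copied from the CITATION.cff entry above.
citation = {
    "authors": ["Lucas Zimmer", "Marius Lindauer", "Frank Hutter"],
    "title": "Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL",
    "journal": "IEEE Transactions on Pattern Analysis and Machine Intelligence",
    "year": 2021,
    "pages": (3079, 3090),
    "doi": "10.1109/TPAMI.2021.3067763",
}

def format_reference(c: dict) -> str:
    """Join the CFF fields into a one-line journal reference string."""
    authors = ", ".join(c["authors"])
    start, end = c["pages"]
    title = c["title"]
    journal = c["journal"]
    return f'{authors}. "{title}." {journal} ({c["year"]}): {start}-{end}. doi:{c["doi"]}'

print(format_reference(citation))
```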

README.md

Lines changed: 52 additions & 8 deletions

```diff
@@ -1,14 +1,14 @@
 # Auto-PyTorch

-Copyright (C) 2019 [AutoML Group Freiburg](http://www.automl.org/)
+Copyright (C) 2021 [AutoML Groups Freiburg and Hannover](http://www.automl.org/)

-This an alpha version of Auto-PyTorch with improved API.
-So far, Auto-PyTorch supports tabular data (classification, regression).
-We plan to enable image data and time-series data.
+While early AutoML frameworks focused on optimizing traditional ML pipelines and their hyperparameters, another trend in AutoML is to focus on neural architecture search. To bring the best of these two worlds together, we developed **Auto-PyTorch**, which jointly and robustly optimizes the network architecture and the training hyperparameters to enable fully automated deep learning (AutoDL).

+Auto-PyTorch is mainly developed to support tabular data (classification, regression).
+The newest features in Auto-PyTorch for tabular data are described in the paper ["Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL"](https://arxiv.org/abs/2006.13799) (see below for bibtex ref).

-Find the documentation [here](https://automl.github.io/Auto-PyTorch/development)
-
+***From v0.1.0, AutoPyTorch has been updated to further improve usability, robustness and efficiency by using SMAC as the underlying optimization package as well as changing the code structure. Therefore, moving from v0.0.2 to v0.1.0 will break compatibility.
+In case you would like to use the old API, you can find it at [`master_old`](https://github.com/automl/Auto-PyTorch/tree/master-old).***

 ## Installation

@@ -33,6 +33,50 @@ python setup.py install

 ```

+## Examples
+
+In a nutshell:
+
+```py
+from autoPyTorch.api.tabular_classification import TabularClassificationTask
+
+# data and metric imports
+import sklearn.model_selection
+import sklearn.datasets
+import sklearn.metrics
+X, y = sklearn.datasets.load_digits(return_X_y=True)
+X_train, X_test, y_train, y_test = \
+    sklearn.model_selection.train_test_split(X, y, random_state=1)
+
+# initialise Auto-PyTorch api
+api = TabularClassificationTask()
+
+# Search for an ensemble of machine learning algorithms
+api.search(
+    X_train=X_train,
+    y_train=y_train,
+    X_test=X_test,
+    y_test=y_test,
+    optimize_metric='accuracy',
+    total_walltime_limit=300,
+    func_eval_time_limit_secs=50
+)
+
+# Calculate test accuracy
+y_pred = api.predict(X_test)
+score = api.score(y_pred, y_test)
+print("Accuracy score", score)
+```
+
+For more examples, including customising the search space, parallelising the code, etc., check out the `examples` folder
+
+```sh
+$ cd examples/
+```
+
+
+Code for the [paper](https://arxiv.org/abs/2006.13799) is available under `examples/ensemble` in the [TPAMI.2021.3067763](https://github.com/automl/Auto-PyTorch/tree/TPAMI.2021.3067763) branch.
+
 ## Contributing

 If you want to contribute to Auto-PyTorch, clone the repository and checkout our current development branch

@@ -63,8 +107,8 @@ Please refer to the branch `TPAMI.2021.3067763` to reproduce the paper *Auto-PyT
   title = {Auto-PyTorch Tabular: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL},
   journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
   year = {2021},
-  note = {IEEE early access; also available under https://arxiv.org/abs/2006.13799},
-  pages = {1-12}
+  note = {also available under https://arxiv.org/abs/2006.13799},
+  pages = {3079 - 3090}
 }
 ```

```

autoPyTorch/api/base_task.py

Lines changed: 16 additions & 0 deletions

```diff
@@ -762,6 +762,7 @@ def _search(
         budget_type (str):
             Type of budget to be used when fitting the pipeline.
             It can be one of:
+
             + `epochs`: The training of each pipeline will be terminated after
               a number of epochs have passed. This number of epochs is determined by the
               budget argument of this method.
@@ -840,6 +841,21 @@
             Numeric precision used when loading ensemble data.
             Can be either '16', '32' or '64'.
         disable_file_output (Union[bool, List]):
+            If True, disable model and prediction output.
+            Can also be used as a list to pass more fine-grained
+            information on what to save. Allowed elements in the list are:
+
+            + `y_optimization`:
+                do not save the predictions for the optimization set,
+                which would later on be used to build an ensemble. Note that SMAC
+                optimizes a metric evaluated on the optimization set.
+            + `pipeline`:
+                do not save any individual pipeline files
+            + `pipelines`:
+                In case of cross validation, disables saving the joint model of the
+                pipelines fit on each fold.
+            + `y_test`:
+                do not save the predictions for the test set.
         load_models (bool: default=True):
             Whether to load the models after fitting AutoPyTorch.
         portfolio_selection (Optional[str]):
```
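The same `disable_file_output` elements are accepted by the tabular classification and regression APIs. As a rough illustration of the documented contract, here is a sketch that pre-validates the argument before handing it to `search`; the `validate_disable_file_output` helper is hypothetical and not part of Auto-PyTorch:

```python
# Allowed list elements, per the docstring above.
ALLOWED = {"y_optimization", "pipeline", "pipelines", "y_test"}

def validate_disable_file_output(value):
    """Hypothetical helper: accept a bool, or a list of known element names."""
    if isinstance(value, bool):
        return value
    unknown = set(value) - ALLOWED
    if unknown:
        raise ValueError(f"Unknown disable_file_output elements: {sorted(unknown)}")
    return list(value)

# Keep ensemble-building predictions but skip saving individual pipelines
# and test-set predictions.
print(validate_disable_file_output(["pipeline", "y_test"]))
```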

autoPyTorch/api/tabular_classification.py

Lines changed: 16 additions & 0 deletions

```diff
@@ -159,6 +159,7 @@ def search(
         budget_type (str):
             Type of budget to be used when fitting the pipeline.
             It can be one of:
+
             + `epochs`: The training of each pipeline will be terminated after
               a number of epochs have passed. This number of epochs is determined by the
               budget argument of this method.
@@ -237,6 +238,21 @@
             Numeric precision used when loading ensemble data.
             Can be either '16', '32' or '64'.
         disable_file_output (Union[bool, List]):
+            If True, disable model and prediction output.
+            Can also be used as a list to pass more fine-grained
+            information on what to save. Allowed elements in the list are:
+
+            + `y_optimization`:
+                do not save the predictions for the optimization set,
+                which would later on be used to build an ensemble. Note that SMAC
+                optimizes a metric evaluated on the optimization set.
+            + `pipeline`:
+                do not save any individual pipeline files
+            + `pipelines`:
+                In case of cross validation, disables saving the joint model of the
+                pipelines fit on each fold.
+            + `y_test`:
+                do not save the predictions for the test set.
         load_models (bool: default=True):
             Whether to load the models after fitting AutoPyTorch.
         portfolio_selection (Optional[str]):
```

autoPyTorch/api/tabular_regression.py

Lines changed: 18 additions & 2 deletions

```diff
@@ -160,6 +160,7 @@ def search(
         budget_type (str):
             Type of budget to be used when fitting the pipeline.
             It can be one of:
+
             + `epochs`: The training of each pipeline will be terminated after
               a number of epochs have passed. This number of epochs is determined by the
               budget argument of this method.
@@ -173,15 +174,15 @@ def search(
             is used, min_budget will refer to epochs whereas if budget_type=='runtime' then
             min_budget will refer to seconds.
         min_budget (int):
-            Auto-PyTorch uses `Hyperband <https://arxiv.org/abs/1603.06560>_` to
+            Auto-PyTorch uses `Hyperband <https://arxiv.org/abs/1603.06560>`_ to
             trade-off resources between running many pipelines at min_budget and
             running the top performing pipelines on max_budget.
             min_budget states the minimum resource allocation a pipeline should have
             so that we can compare and quickly discard bad performing models.
             For example, if the budget_type is epochs, and min_budget=5, then we will
             run every pipeline to a minimum of 5 epochs before performance comparison.
         max_budget (int):
-            Auto-PyTorch uses `Hyperband <https://arxiv.org/abs/1603.06560>_` to
+            Auto-PyTorch uses `Hyperband <https://arxiv.org/abs/1603.06560>`_ to
             trade-off resources between running many pipelines at min_budget and
             running the top performing pipelines on max_budget.
             max_budget states the maximum resource allocation a pipeline is going to
@@ -238,6 +239,21 @@
             Numeric precision used when loading ensemble data.
             Can be either '16', '32' or '64'.
         disable_file_output (Union[bool, List]):
+            If True, disable model and prediction output.
+            Can also be used as a list to pass more fine-grained
+            information on what to save. Allowed elements in the list are:
+
+            + `y_optimization`:
+                do not save the predictions for the optimization set,
+                which would later on be used to build an ensemble. Note that SMAC
+                optimizes a metric evaluated on the optimization set.
+            + `pipeline`:
+                do not save any individual pipeline files
+            + `pipelines`:
+                In case of cross validation, disables saving the joint model of the
+                pipelines fit on each fold.
+            + `y_test`:
+                do not save the predictions for the test set.
         load_models (bool: default=True):
             Whether to load the models after fitting AutoPyTorch.
         portfolio_selection (Optional[str]):
```
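The docstring's example (budget_type epochs, min_budget=5) can be made concrete with a small sketch of the budget ladder Hyperband builds between min_budget and max_budget. The `budget_ladder` helper and the `eta=3` multiplier are illustrative assumptions; the real schedule is managed internally by SMAC, not by user code:

```python
def budget_ladder(min_budget, max_budget, eta=3):
    """Successive-halving rungs from min_budget up to max_budget,
    each rung eta times larger than the last. Many pipelines run at
    the lowest rung; only the top performers advance to larger budgets."""
    budgets = [min_budget]
    while budgets[-1] * eta <= max_budget:
        budgets.append(budgets[-1] * eta)
    return budgets

# With budget_type='epochs', min_budget=5 and max_budget=50:
print(budget_ladder(5, 50))  # [5, 15, 45]
```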

docs/extending.rst

Lines changed: 2 additions & 0 deletions

```diff
@@ -5,3 +5,5 @@
 ======================
 Extending Auto-PyTorch
 ======================
+
+TODO
```

setup.py

Lines changed: 1 addition & 1 deletion

```diff
@@ -23,7 +23,7 @@
     name="autoPyTorch",
     version="0.1.0",
     author="AutoML Freiburg",
-    author_email="[email protected]",
+    author_email="[email protected]",
     description=("Auto-PyTorch searches neural architectures using smac"),
     long_description=long_description,
     url="https://github.com/automl/Auto-PyTorch",
```
