Proposal: split train.py into train.py and train_aml.py #219

jotaylo · 2020-03-02T22:00:46Z

This change splits train.py into two files.

The new train.py is standalone and has no references to AzureML. It defines two functions: split_data, which splits a dataframe into train and test data, and train_model, which takes the train/test data and a parameter object, trains the model, and returns related metrics. The script can also be run locally, in which case it loads the dataset from a file.
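As a rough illustration of that interface (not the code in this PR), a standalone script with the same two entry points might look like the sketch below. Everything beyond the names split_data and train_model is an assumption: plain (x, y) tuples stand in for the dataframe, a one-feature ridge fit stands in for the real estimator, the parameter object is a dict with a hypothetical "alpha" key, and train_model is assumed to return both the model and a metrics dict.

```python
import random


def split_data(rows, test_fraction=0.2, seed=0):
    """Shuffle (x, y) rows and split them into train and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return {"train": shuffled[n_test:], "test": shuffled[:n_test]}


def train_model(data, params):
    """Fit y = slope * x + intercept with an L2 penalty on the slope.

    Returns the fitted model and a dict of metrics (assumed return shape).
    """
    xs = [x for x, _ in data["train"]]
    ys = [y for _, y in data["train"]]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in data["train"])
    sxx = sum((x - x_mean) ** 2 for x in xs)
    # "alpha" is a hypothetical parameter name used only for this sketch.
    slope = sxy / (sxx + params.get("alpha", 0.0))
    intercept = y_mean - slope * x_mean
    model = {"slope": slope, "intercept": intercept}
    errors = [(slope * x + intercept - y) ** 2 for x, y in data["test"]]
    return model, {"mse": sum(errors) / len(errors)}


def main():
    # Local run: the real script loads a dataset from a file;
    # synthetic rows stand in for it here.
    rows = [(float(x), 2.0 * x + 1.0) for x in range(20)]
    data = split_data(rows)
    model, metrics = train_model(data, {"alpha": 0.0})
    print(model, metrics)


if __name__ == "__main__":
    main()
```

Because neither function touches AzureML, both can be imported by other callers or unit-tested directly.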

The second file, train_aml.py, contains reasonably general AzureML logic. It reads data from a dataset, then calls split_data from train.py. It loads input parameters from a config file and logs them, then calls train_model from train.py. Finally, it uploads the model and logs any metrics returned by train_model.
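The flow described for train_aml.py could be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: the config file is assumed to be JSON, the model is assumed to be picklable, and RecordingRun is a hypothetical stand-in whose log(name, value) and upload_file(name, path) methods are modeled on the AzureML run object so the sketch runs without the SDK. The two training functions are passed in as arguments to keep the block self-contained; in the PR they are imported from train.py.

```python
import json
import pickle
import tempfile
from pathlib import Path


class RecordingRun:
    """Hypothetical stand-in for an experiment run; records calls locally."""

    def __init__(self):
        self.logged = {}
        self.uploaded = []

    def log(self, name, value):
        self.logged[name] = value

    def upload_file(self, name, path_or_stream):
        self.uploaded.append(name)


def run_training(run, rows, config_path, split_data, train_model,
                 model_name="model.pkl"):
    """Glue step: load and log parameters, train, upload model, log metrics."""
    # Assumption: parameters live in a flat JSON object.
    params = json.loads(Path(config_path).read_text())
    for name, value in params.items():
        run.log(name, value)

    data = split_data(rows)
    model, metrics = train_model(data, params)

    # Serialize the model to a temporary file, then hand it to the run.
    with tempfile.TemporaryDirectory() as tmp:
        model_path = Path(tmp) / model_name
        model_path.write_bytes(pickle.dumps(model))
        run.upload_file(model_name, str(model_path))

    for name, value in metrics.items():
        run.log(name, value)
    return metrics
```

Keeping the wrapper this thin means a different training script can be plugged in as long as it exposes the same two functions.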

The hope is that these changes demonstrate a simple interface for integrating an existing ML script with MLOpsPython, and provide an example of how the core ML functionality can be invoked in multiple ways during development.

dtzar

LGTM - I'd probably wait to merge this until you have the Jupyter notebook equivalent of train_aml.py fleshed out in conjunction with this.

split train.py into train.py and train_aml.py

1c6fea2

dtzar approved these changes Mar 2, 2020


eedorenko self-requested a review March 2, 2020 23:05

Merge branch 'master' into jotaylo/split_train_script

3df5183

eedorenko approved these changes Mar 3, 2020


tcare mentioned this pull request Mar 3, 2020

Getting started refactor #216

Merged

jotaylo and others added 5 commits March 3, 2020 12:05

split train_model into two functions

9cbb4aa

make unit tests more self contained

b3ed909

Added Experiment and Pipeline notebooks

b063abc

rename and clean up experimentation notebooks

b6b6144

Merge branch 'master' into jotaylo/split_train_script

25467ff

jotaylo merged commit 39609ae into master Mar 5, 2020

jotaylo deleted the jotaylo/split_train_script branch March 5, 2020 01:40


5 participants
