This repository was archived by the owner on Sep 11, 2023. It is now read-only.
Implement a thin "data loading" layer to help ML training #97
Labels
data
New data source or feature; or modification of existing data source
enhancement
New feature or request
Comments
This was referenced Sep 7, 2021
jacobbieker added a commit that referenced this issue Sep 14, 2021
jacobbieker added a commit that referenced this issue Sep 14, 2021
jacobbieker added a commit that referenced this issue Sep 17, 2021
* Add customizable required keys
* Add customizable required keys
* Add TODO ideas relating to #97
* Start on subsetting data
* Add customizable required keys
* Add customizable required keys
* Add TODO ideas relating to #97
* Start on subsetting data
* Readd required keys
* Run black formatting
* Add init
* Add subsetting temporal data
* Rename to better reflect current index
* Add some higher imports
* Add Example
* Fix circular import, subset time sin/cos etc.
* Remove duplicated file
* Update constants
* Run black
* Remove todo
* Split out subselecting into own function
* Move datetime feature names to required_keys
* Add check for required keys
* Add unittest for subselect_data
* Update docstring
* Add docstring
* Update version
* Import more constants
* Add 30 second explanation
* Remove extra checks, PR comment
* Update subselect_data for xarray
* Update with simpler version
* Passing test
* Make test shorter, add test file
* Update nowcasting_dataset/dataset/datasets.py
  Co-authored-by: Jack Kelly <[email protected]>
* Update nowcasting_dataset/dataset/datasets.py
  Co-authored-by: Jack Kelly <[email protected]>
* Update nowcasting_dataset/dataset/datasets.py
  Co-authored-by: Jack Kelly <[email protected]>
* Reduce code duplication in subselect
* Change how Datetimes selected
* Fix positional arg
* Simplify a bit further
* Simplify a bit further
* Fix error
  Co-authored-by: Jack Kelly <[email protected]>
38 tasks
This is implemented by |
Repository owner moved this from Todo to Done in Nowcasting Oct 22, 2021
The bulk of `nowcasting_dataset` is about saving pre-prepared batches of data to disk. But it feels like there's perhaps a need for a set of simple tools to help load those pre-prepared batches into ML models during training. This has come up a few times in other issues:

* Tasks like subsetting the data should be done as far upstream as possible, so we only load from disk the data we want.
* Tasks like data augmentation could perhaps be done in PyTorch 'transforms'?
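The two ideas above could be sketched roughly as follows. This is only an illustration, not the code in `nowcasting_dataset`: the on-disk layout (one directory per batch, one JSON file per key), the class names `SubsetBatchDataset`, `Compose` and `AddGaussianNoise`, the helper `write_demo_batches`, and the key names `pv_yield` and `sat_data` are all assumptions made for the example. It uses only the standard library. The dataset follows the map-style protocol (`__len__` and `__getitem__`), and the transforms are plain callables in the style of PyTorch transforms.

```python
import json
import random
from pathlib import Path
from typing import Callable, Dict, List, Optional, Sequence

Batch = Dict[str, List[float]]


class SubsetBatchDataset:
    """Map-style dataset over pre-prepared batches (hypothetical layout).

    Assumed layout: <root>/<batch_dir>/<key>.json, one file per key.
    """

    def __init__(
        self,
        root: str,
        required_keys: Sequence[str],
        transform: Optional[Callable[[Batch], Batch]] = None,
    ) -> None:
        self.batch_dirs = sorted(p for p in Path(root).iterdir() if p.is_dir())
        self.required_keys = list(required_keys)
        self.transform = transform

    def __len__(self) -> int:
        return len(self.batch_dirs)

    def __getitem__(self, index: int) -> Batch:
        batch_dir = self.batch_dirs[index]
        # Subset as far upstream as possible: only the files for the
        # required keys are opened; everything else stays on disk.
        batch = {
            key: json.loads((batch_dir / f"{key}.json").read_text())
            for key in self.required_keys
        }
        if self.transform is not None:
            batch = self.transform(batch)
        return batch


class Compose:
    """Apply a list of transforms in order."""

    def __init__(self, transforms: Sequence[Callable[[Batch], Batch]]) -> None:
        self.transforms = list(transforms)

    def __call__(self, batch: Batch) -> Batch:
        for transform in self.transforms:
            batch = transform(batch)
        return batch


class AddGaussianNoise:
    """Toy augmentation: add zero-mean Gaussian noise to one key."""

    def __init__(self, key: str, std: float, seed: Optional[int] = None) -> None:
        self.key = key
        self.std = std
        self.rng = random.Random(seed)

    def __call__(self, batch: Batch) -> Batch:
        out = dict(batch)  # shallow copy so the input dict is not mutated
        out[self.key] = [v + self.rng.gauss(0.0, self.std) for v in batch[self.key]]
        return out


def write_demo_batches(root: str, n_batches: int = 3) -> None:
    """Write tiny fake batches in the assumed layout (for demonstration only)."""
    for i in range(n_batches):
        batch_dir = Path(root) / f"batch_{i}"
        batch_dir.mkdir()
        (batch_dir / "pv_yield.json").write_text(json.dumps([i + 0.0, i + 1.0, i + 2.0]))
        (batch_dir / "sat_data.json").write_text(json.dumps([0.5] * 4))
```

With `required_keys=["pv_yield"]`, the `sat_data.json` files are never read, which is the point of subsetting upstream. Augmentation stays a separate, optional step passed in as `transform`, so the same stored batches can feed different models.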