Download dataset scripts FIX + new option to download datasets from benchmark configs #129
This PR includes changes to the `datasets/load_datasets.py` script and related documentation.

In more detail: the previous version of the script had a bug that caused it to ignore any dataset name passed via the `-d` option and to download all the datasets instead. This PR fixes that bug and also improves the internal documentation (i.e. the help message).
Moreover, a new option has been added to the script, namely `-c`, `--configs`. This option overrides the manual selection of datasets to download by automatically extracting the names of the required datasets from the input configuration file(s).
This is particularly useful when preparing to run multiple benchmark experiments, because all the datasets those experiments need can be downloaded in a single run.
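For reviewers, the intended precedence between the two options can be sketched roughly as follows. This is only an illustration, not the code in `datasets/load_datasets.py`: the `AVAILABLE` list, the function names, and in particular the config format (a JSON file with a top-level `"dataset"` key) are assumptions made for the example.

```python
import argparse
import json

# Hypothetical dataset names; the real script defines its own list.
AVAILABLE = ["dataset_a", "dataset_b", "dataset_c"]


def build_parser():
    """Build a parser exposing the -d and -c/--configs options."""
    parser = argparse.ArgumentParser(description="Download datasets (sketch).")
    parser.add_argument(
        "-d", dest="datasets", nargs="+", default=None,
        help="names of the datasets to download (default: all)",
    )
    parser.add_argument(
        "-c", "--configs", nargs="+", default=None,
        help="config file(s) from which dataset names are extracted; "
             "overrides -d",
    )
    return parser


def names_from_configs(paths):
    """Collect unique dataset names from config files, keeping their order.

    Assumption: each config is a JSON file with a top-level "dataset" key.
    """
    names = []
    for path in paths:
        with open(path) as handle:
            name = json.load(handle)["dataset"]
        if name not in names:
            names.append(name)
    return names


def select_datasets(args):
    """Resolve which datasets to download: configs first, then -d, then all."""
    if args.configs:
        return names_from_configs(args.configs)
    if args.datasets:
        # The fixed behaviour: honour the names given via -d.
        return list(args.datasets)
    return list(AVAILABLE)
```

In this sketch, `select_datasets(build_parser().parse_args(["-d", "dataset_b"]))` returns only `dataset_b`, whereas passing `-c` with one or more config files ignores `-d` and returns the names found in those files.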
A new `README.md` file has been added to the `datasets` package, in line with the other packages included in the benchmark. This documentation file includes detailed instructions on how to run and use the `load_datasets` utility.

Also, a new section has been added to the main `README.md` file to highlight the new features.