fix prototype datasets data loading tests #5711

Merged: 14 commits, Apr 5, 2022
7 changes: 3 additions & 4 deletions .circleci/config.yml

Some generated files are not rendered by default.

7 changes: 3 additions & 4 deletions .circleci/config.yml.in
@@ -152,11 +152,10 @@ commands:
args: --no-build-isolation <<# parameters.editable >> --editable <</ parameters.editable >> .
descr: Install torchvision <<# parameters.editable >> in editable mode <</ parameters.editable >>

# Installs all extra dependencies that are needed in the torchvision.prototype namespace, but are not tracked in the
# project requirements.
install_prototype_dependencies:
steps:
- pip_install:
args: iopath
descr: Install third-party dependencies
- pip_install:
args: --pre torchdata --extra-index-url https://download.pytorch.org/whl/nightly/cpu
descr: Install torchdata from nightly releases
@@ -366,7 +365,7 @@ jobs:
- install_torchvision
- install_prototype_dependencies
- pip_install:
args: scipy pycocotools h5py
args: scipy pycocotools h5py dill
descr: Install optional dependencies
- run_tests_selective:
file_or_dir: test/test_prototype_*.py
17 changes: 13 additions & 4 deletions test/test_prototype_builtin_datasets.py
@@ -7,6 +7,7 @@
import torch
from builtin_dataset_mocks import parametrize_dataset_mocks, DATASET_MOCKS
from torch.testing._comparison import assert_equal, TensorLikePair, ObjectPair
from torch.utils.data._utils.serialization import DILL_AVAILABLE
from torch.utils.data.graph import traverse
from torch.utils.data.graph_settings import get_all_graph_pipes
from torchdata.datapipes.iter import Shuffler, ShardingFilter
@@ -109,19 +110,27 @@ def test_transformable(self, test_home, dataset_mock, config):

next(iter(dataset.map(transforms.Identity())))

@pytest.mark.xfail(reason="See https://github.com/pytorch/data/issues/237")
@parametrize_dataset_mocks(DATASET_MOCKS)
def test_serializable(self, test_home, dataset_mock, config):
def test_serializable_pickle(self, test_home, dataset_mock, config):
dataset_mock.prepare(test_home, config)

dataset = datasets.load(dataset_mock.name, **config)

pickle.dumps(dataset)

@pytest.mark.skipif(not DILL_AVAILABLE, reason="Package `dill` is not available.")
# TODO: remove this as soon as dill is fully supported
@pytest.mark.xfail(reason="See https://github.com/pytorch/data/issues/237")
def test_serializable_dill(self, test_home, dataset_mock, config):
import dill

dataset_mock.prepare(test_home, config)
dataset = datasets.load(dataset_mock.name, **config)

dill.dumps(dataset)

# TODO: we need to enforce not only that both a Shuffler and a ShardingFilter are part of the datapipe, but also
# that the Shuffler comes before the ShardingFilter. Early commits in https://github.com/pytorch/vision/pull/5680
# contain a custom test for that, but we opted to wait for a potential solution / test from torchdata for now.
@pytest.mark.xfail(reason="See https://github.com/pytorch/data/issues/237")
@parametrize_dataset_mocks(DATASET_MOCKS)
@pytest.mark.parametrize("annotation_dp_type", (Shuffler, ShardingFilter))
def test_has_annotations(self, test_home, dataset_mock, config, annotation_dp_type):
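
Note on the pickle / dill split in the serialization tests: the two serializers differ in what they accept, which a small standalone check can illustrate. This is a minimal sketch and not code from this PR. `is_serializable`, `importable_step`, and `lambda_step` are hypothetical names, and dill is detected here with `importlib.util.find_spec` instead of the private `DILL_AVAILABLE` flag that the test module imports.

```python
import importlib.util
import operator
import pickle
from functools import partial

# Assumption: detect dill via importlib rather than torch's private DILL_AVAILABLE flag.
DILL_AVAILABLE = importlib.util.find_spec("dill") is not None


def is_serializable(obj, dumps):
    """Return True if dumps(obj) succeeds, False if it raises."""
    try:
        dumps(obj)
    except Exception:
        return False
    return True


# A callable built from an importable function: pickle stores it by reference.
importable_step = partial(operator.mul, 2)
# A lambda has no importable name, so the standard pickle module rejects it.
lambda_step = lambda x: 2 * x

pickle_results = {
    "importable": is_serializable(importable_step, pickle.dumps),
    "lambda": is_serializable(lambda_step, pickle.dumps),
}
print("pickle:", pickle_results)

# Only exercise dill when it is installed, mirroring the skipif guard in the test.
if DILL_AVAILABLE:
    import dill

    dill_results = {
        "importable": is_serializable(importable_step, dill.dumps),
        "lambda": is_serializable(lambda_step, dill.dumps),
    }
    print("dill:", dill_results)
```

With the standard library alone, the importable callable serializes and the lambda does not. dill is designed to handle a wider range of objects, including lambdas, so a dedicated, optional dill test lets that path be tracked separately from the plain pickle path.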