revamp prototype features #5283

pmeier · 2022-01-26T13:51:39Z

This PR was extracted from a prototype branch that changed multiple things. I tried to separate as much as possible, but there are still multiple changes here. I'll explain the large changes below and leave more comments inline for other changes.

Note that we are not merging into main but into a temporary branch. All lint errors will be fixed when that branch is merged into main.

Streamline `Feature` implementation

The old implementation was quite convoluted. This patch simplifies it considerably without hurting flexibility. This is achieved by dropping the `like` keyword from the constructor and adding a dedicated `.new_like()` method instead.
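
To illustrate the pattern, here is a minimal, torch-free sketch. It is not the torchvision implementation: the toy `BoundingBox` class, its `format` and `image_size` metadata, and the exact `new_like()` signature are assumptions made for this example. The idea is that the constructor only builds an object from explicit arguments, while `new_like()` copies metadata from an existing object unless it is overridden.

```python
from typing import Any, Sequence, Tuple


class BoundingBox:
    """Toy stand-in for a feature: plain data plus metadata (hypothetical)."""

    def __init__(self, data: Sequence[float], *, format: str, image_size: Tuple[int, int]) -> None:
        self.data = list(data)
        self.format = format
        self.image_size = image_size

    @classmethod
    def new_like(cls, other: "BoundingBox", data: Sequence[float], **overrides: Any) -> "BoundingBox":
        # Start from the metadata of `other`, then apply explicit overrides.
        metadata = {"format": other.format, "image_size": other.image_size}
        metadata.update(overrides)
        return cls(data, **metadata)


box = BoundingBox([0, 0, 10, 10], format="xyxy", image_size=(32, 32))
# Same metadata as `box`, new data.
shifted = BoundingBox.new_like(box, [5, 5, 15, 15])
# New data and one overridden metadata field.
resized = BoundingBox.new_like(box, [0, 0, 20, 20], image_size=(64, 64))
```

Keeping the "copy from another object" logic out of the constructor means the constructor has a single code path, which is the simplification described above.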

Rework `Label`

Previously, `Label` took a `category` parameter to store the human-readable name. As explained in #5045, this has a major downside: when stacking multiple labels, as in a batch, the category does not need to be the same. This PR replaces it with a `categories` parameter, which is constant for all labels of the same dataset. In addition, this PR adds a `.to_categories()` method that turns a label into its human-readable form.
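
A rough sketch of why a dataset-wide `categories` list works for batches, using a plain Python class instead of a tensor subclass. The class body, the error handling, and the example category names are assumptions for illustration only, not the code in this PR.

```python
from typing import List, Optional, Sequence


class Label:
    """Toy label: integer class indices plus the dataset-wide category names."""

    def __init__(self, data: Sequence[int], *, categories: Optional[Sequence[str]] = None) -> None:
        self.data = list(data)
        self.categories = None if categories is None else list(categories)

    def to_categories(self) -> List[str]:
        # Map every index to its human-readable name.
        if self.categories is None:
            raise RuntimeError("Label has no categories attached")
        return [self.categories[idx] for idx in self.data]


# Hypothetical category list, shared by every label of the dataset.
CATEGORIES = ["cat", "dog", "bird"]

# A "batch" of four labels with different classes still shares one list.
batch = Label([2, 0, 0, 1], categories=CATEGORIES)
print(batch.to_categories())  # ['bird', 'cat', 'cat', 'dog']
```

With a single `category` string per label, the batch above could not carry its names, since the four entries belong to three different classes.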

Add features for encoded data

This was discussed in #5075 (comment).

facebook-github-bot · 2022-01-26T13:51:46Z

💊 CI failures summary and remediations

As of commit 1876e85 (more details on the Dr. CI page):

1/1 failures introduced in this PR

1 failure not recognized by patterns:

Job	Step	Action
lint_python_and_config	Lint Python code and config files	🔁 rerun

This comment was automatically generated by Dr. CI (expand for details).

Please report bugs/suggestions to the (internal) Dr. CI Users group.


pmeier · 2022-01-26T13:52:59Z

test/test_prototype_features.py

@@ -1,185 +0,0 @@
-import functools


This was removed since it only partially aligns with the new design. We can add tests back once the design is more stable.

pmeier · 2022-01-26T13:55:47Z

torchvision/prototype/datasets/utils/_internal.py

@@ -267,69 +265,6 @@ def _make_sharded_datapipe(root: str, dataset_size: int) -> IterDataPipe[Dict[st
    return dp


-def _read_mutable_buffer_fallback(file: BinaryIO, count: int, item_size: int) -> bytearray:


These functions are not removed, but rather moved into `torchvision.prototype.utils._internal`, since they are used both in the datasets to read binary files and for reading the raw bytes of encoded images for the new features.

pmeier · 2022-01-26T13:57:00Z

torchvision/prototype/features/_bounding_box.py

-    CXCYWH = enum.auto()
-
-
-def to_parts(input: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor]:


The conversion logic will live under the transforms and will be added back in a later PR.

pmeier · 2022-01-26T13:59:28Z

torchvision/prototype/features/_encoded.py

+class EncodedImage(EncodedData):
+    # TODO: Use @functools.cached_property if we can depend on Python 3.8
+    @property
+    def image_size(self) -> Tuple[int, int]:


In some cases we need the image size in the datasets to instantiate a bounding box. This is a convenient way to probe it, since PIL only reads the first few bytes to get this information.
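
To show why probing the size is cheap, here is a standard-library sketch that swaps PIL for a hand-written PNG header parser. This is not how `EncodedImage.image_size` is implemented; `probe_png_size` is a hypothetical helper, it handles PNG only, and the `(width, height)` return order is this sketch's choice. It relies on the PNG layout, where the 8-byte signature is followed by the `IHDR` chunk whose first two fields are width and height.

```python
import struct
from typing import Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def probe_png_size(data: bytes) -> Tuple[int, int]:
    """Return (width, height) using only the first 24 bytes of a PNG."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG header")
    # Width and height are big-endian unsigned 32-bit integers.
    width, height = struct.unpack(">II", data[16:24])
    return width, height


# Hand-built fragment: signature, IHDR length (13), chunk type, width=640, height=480.
header = PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + struct.pack(">II", 640, 480)
print(probe_png_size(header))  # (640, 480)
```

No pixel data is decoded, so the cost is independent of the image dimensions.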

pmeier · 2022-01-26T14:00:42Z

torchvision/prototype/features/_feature.py

-        return dict()
-
-    @classmethod
-    def __torch_function__(


This will be added back with the transforms rework.

pmeier · 2022-01-26T14:02:17Z

torchvision/prototype/utils/_internal.py

+        sequence: List[Any] = []
+        for item in obj:
+            result = apply_recursively(fn, item)
+            if isinstance(result, collections.abc.Sequence) and hasattr(result, "__inline__"):


This makes no sense yet, but it will in a later PR.
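
For context, a simplified sketch of what a recursive apply helper along these lines might look like. It is an assumption based only on the hunk above: the `__inline__` branch is left out because its semantics are not explained here, and the handling of mappings and strings is this sketch's choice.

```python
import collections.abc
from typing import Any, Callable


def apply_recursively(fn: Callable[[Any], Any], obj: Any) -> Any:
    # Strings and bytes are sequences too, but are treated as leaves here.
    if isinstance(obj, collections.abc.Sequence) and not isinstance(obj, (str, bytes)):
        return [apply_recursively(fn, item) for item in obj]
    if isinstance(obj, collections.abc.Mapping):
        return {key: apply_recursively(fn, value) for key, value in obj.items()}
    # Anything else is a leaf: apply the function directly.
    return fn(obj)


sample = {"image": 1, "boxes": [2, 3], "meta": {"id": 4}}
doubled = apply_recursively(lambda x: x * 2, sample)
print(doubled)  # {'image': 2, 'boxes': [4, 6], 'meta': {'id': 8}}
```

Note that this version returns every sequence as a list, so tuples in the input do not keep their type.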

NicolasHug

Thanks @pmeier, I only took a very brief look. As discussed offline, I'll approve to unlock and will give it a more in-depth look later.

* revamp prototype features (#5283)
* remove decoding from prototype datasets (#5287)
  * remove decoder from prototype datasets
  * remove unused imports
  * cleanup
  * fix readme
  * use OneHotLabel in SEMEION
  * improve voc implementation
  * revert unrelated changes
  * fix semeion mock data
  * fix pcam
* readd functional transforms API to prototype (#5295)
  * readd functional transforms
  * cleanup
  * add missing imports
  * remove __torch_function__ dispatch
  * readd repr
  * readd empty line
  * add test for scriptability
  * remove function copy
  * change import from functional tensor transforms to just functional
  * fix import
  * fix test
* fix prototype features and functional transforms after review (#5377)
  * fix prototype functional transforms after review
  * address features review
  * make mypy more strict on prototype features
  * make mypy more strict for prototype transforms
  * fix annotation
  * fix kernel tests
* add automatic feature type dispatch to functional transforms (#5323)
  * add auto dispatch
  * fix missing arguments error message
  * remove pil kernel for erase
  * automate feature specific parameter detection
  * fix typos
  * cleanup dispatcher call
  * remove __torch_function__ from transform dispatch
  * remove auto-generation
  * revert unrelated changes
  * remove implements decorator
  * change register parameter order
  * change order of transforms for readability
  * add documentation for __torch_function__
  * fix mypy
  * inline check for support
  * refactor kernel registering process
  * refactor dispatch to be a regular decorator
  * split kernels and dispatchers
  * remove sentinels
  * replace pass with ...
  * appease mypy
  * make single kernel dispatchers more concise
  * make dispatcher signatures more generic
  * make kernel checking more strict
  * revert doc changes
  * address Franciscos comments
  * remove inplace
  * rename kernel test module
  * fix inplace
  * remove special casing for pil and vanilla tensors
  * address comments
  * update docs
* cleanup features / transforms feature branch (#5406)
  * mark candidates for removal
  * align signature of resize_bounding_box with corresponding image kernel
  * fix documentation of Feature
  * remove interpolation mode and antialias option from resize_segmentation_mask
  * remove or privatize functionality in features / datasets / transforms

Reviewed By: sallysyw
Differential Revision: D34265747
fbshipit-source-id: 569ed9f74ac0c026391767c3b422ca0147f55ead

revamp prototype features

1876e85

pmeier added module: datasets prototype labels Jan 26, 2022

pytorch-bot bot added the ciflow/default label Jan 26, 2022

facebook-github-bot added the cla signed label Jan 26, 2022

pmeier requested a review from NicolasHug January 26, 2022 14:04

NicolasHug approved these changes Jan 26, 2022

pmeier merged commit bfc8510 into pytorch:revamp-prototype-features-transforms Jan 26, 2022

pmeier deleted the revamp-prototype-features branch January 26, 2022 14:36

pmeier mentioned this pull request Jan 26, 2022

remove decoding from prototype datasets #5287

Merged
