Skip to content

revamp prototype features #5283

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

pmeier
Copy link
Collaborator

@pmeier pmeier commented Jan 26, 2022

This PR was extracted from a prototype branch that changed multiple things. I tried to separate as much as possible, but there are still multiple changes here. I'll explain the large changes below and leave more comments inline for other changes.

Note, that we are not merging into main but into a temporary branch. All lint errors will be fixed when merging the other branch into main.


Streamline Feature implementation

The old implementation was quite convoluted. This patch simplifies this by a lot without hurting flexibility. This is achieved by dropping the like keyword from the constructor and have a dedicated .new_like() method for that.

Rework Label

Currently, Label took a category parameter to store the human readable name. As explained in #5045 this has a major downside: when stacking multiple labels like in a batch the category does not need to be the same. This PR changes this to have a categories parameter. which is constant for all labels of the same dataset. In addition, this PR adds a .to_categories() method that turns a label into the human readable form.

Add features for encoded data

This was discussed in #5075 (comment).

@facebook-github-bot
Copy link

facebook-github-bot commented Jan 26, 2022

💊 CI failures summary and remediations

As of commit 1876e85 (more details on the Dr. CI page):


  • 1/1 failures introduced in this PR

1 failure not recognized by patterns:

Job Step Action
CircleCI lint_python_and_config Lint Python code and config files 🔁 rerun

This comment was automatically generated by Dr. CI (expand for details).

Please report bugs/suggestions to the (internal) Dr. CI Users group.

Click here to manually regenerate this comment.

@@ -1,185 +0,0 @@
import functools
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this removed since it only partially aligns with the new design. We can add tests back if the design is more stable.

@@ -267,69 +265,6 @@ def _make_sharded_datapipe(root: str, dataset_size: int) -> IterDataPipe[Dict[st
return dp


def _read_mutable_buffer_fallback(file: BinaryIO, count: int, item_size: int) -> bytearray:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These functions are not removed, but rather moved into torchvision.prototype.utils._internal since they are used in datasets to read binary files as well as for reading the raw bytes for encoded images for the new features.

CXCYWH = enum.auto()


def to_parts(input: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor]:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The conversion logic will live under the transforms and will be added back in a later PR.

class EncodedImage(EncodedData):
# TODO: Use @functools.cached_property if we can depend on Python 3.8
@property
def image_size(self) -> Tuple[int, int]:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In some cases we need the image size in the datasets to instantiate a bounding box. This is convenient way to probe it, since PIL only reads the first few bytes to get this information.

return dict()

@classmethod
def __torch_function__(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be added back with the transforms rework.

sequence: List[Any] = []
for item in obj:
result = apply_recursively(fn, item)
if isinstance(result, collections.abc.Sequence) and hasattr(result, "__inline__"):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes no sense yet, but it will on a later PR.

@pmeier pmeier requested a review from NicolasHug January 26, 2022 14:04
Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @pmeier , I only took a very brief look. As discussed offline I'll approve to unlock and will give it a more in-depth look later

@pmeier pmeier merged commit bfc8510 into pytorch:revamp-prototype-features-transforms Jan 26, 2022
@pmeier pmeier deleted the revamp-prototype-features branch January 26, 2022 14:36
datumbox pushed a commit that referenced this pull request Feb 11, 2022
* revamp prototype features (#5283)

* remove decoding from prototype datasets (#5287)

* remove decoder from prototype datasets

* remove unused imports

* cleanup

* fix readme

* use OneHotLabel in SEMEION

* improve voc implementation

* revert unrelated changes

* fix semeion mock data

* fix pcam

* readd functional transforms API to prototype (#5295)

* readd functional transforms

* cleanup

* add missing imports

* remove __torch_function__ dispatch

* readd repr

* readd empty line

* add test for scriptability

* remove function copy

* change import from functional tensor transforms to just functional

* fix import

* fix test

* fix prototype features and functional transforms after review (#5377)

* fix prototype functional transforms after review

* address features review

* make mypy more strict on prototype features

* make mypy more strict for prototype transforms

* fix annotation

* fix kernel tests

* add automatic feature type dispatch to functional transforms (#5323)

* add auto dispatch

* fix missing arguments error message

* remove pil kernel for erase

* automate feature specific parameter detection

* fix typos

* cleanup dispatcher call

* remove __torch_function__ from transform dispatch

* remove auto-generation

* revert unrelated changes

* remove implements decorator

* change register parameter order

* change order of transforms for readability

* add documentation for __torch_function__

* fix mypy

* inline check for support

* refactor kernel registering process

* refactor dispatch to be a regular decorator

* split kernels and dispatchers

* remove sentinels

* replace pass with ...

* appease mypy

* make single kernel dispatchers more concise

* make dispatcher signatures more generic

* make kernel checking more strict

* revert doc changes

* address Franciscos comments

* remove inplace

* rename kernel test module

* fix inplace

* remove special casing for pil and vanilla tensors

* address comments

* update docs

* cleanup features / transforms feature branch (#5406)

* mark candidates for removal

* align signature of resize_bounding_box with corresponding image kernel

* fix documentation of Feature

* remove interpolation mode and antialias option from resize_segmentation_mask

* remove or privatize functionality in features / datasets / transforms
facebook-github-bot pushed a commit that referenced this pull request Feb 16, 2022
Summary:
* revamp prototype features (#5283)

* remove decoding from prototype datasets (#5287)

* remove decoder from prototype datasets

* remove unused imports

* cleanup

* fix readme

* use OneHotLabel in SEMEION

* improve voc implementation

* revert unrelated changes

* fix semeion mock data

* fix pcam

* readd functional transforms API to prototype (#5295)

* readd functional transforms

* cleanup

* add missing imports

* remove __torch_function__ dispatch

* readd repr

* readd empty line

* add test for scriptability

* remove function copy

* change import from functional tensor transforms to just functional

* fix import

* fix test

* fix prototype features and functional transforms after review (#5377)

* fix prototype functional transforms after review

* address features review

* make mypy more strict on prototype features

* make mypy more strict for prototype transforms

* fix annotation

* fix kernel tests

* add automatic feature type dispatch to functional transforms (#5323)

* add auto dispatch

* fix missing arguments error message

* remove pil kernel for erase

* automate feature specific parameter detection

* fix typos

* cleanup dispatcher call

* remove __torch_function__ from transform dispatch

* remove auto-generation

* revert unrelated changes

* remove implements decorator

* change register parameter order

* change order of transforms for readability

* add documentation for __torch_function__

* fix mypy

* inline check for support

* refactor kernel registering process

* refactor dispatch to be a regular decorator

* split kernels and dispatchers

* remove sentinels

* replace pass with ...

* appease mypy

* make single kernel dispatchers more concise

* make dispatcher signatures more generic

* make kernel checking more strict

* revert doc changes

* address Franciscos comments

* remove inplace

* rename kernel test module

* fix inplace

* remove special casing for pil and vanilla tensors

* address comments

* update docs

* cleanup features / transforms feature branch (#5406)

* mark candidates for removal

* align signature of resize_bounding_box with corresponding image kernel

* fix documentation of Feature

* remove interpolation mode and antialias option from resize_segmentation_mask

* remove or privatize functionality in features / datasets / transforms

Reviewed By: sallysyw

Differential Revision: D34265747

fbshipit-source-id: 569ed9f74ac0c026391767c3b422ca0147f55ead
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants