-
Notifications
You must be signed in to change notification settings - Fork 7.1k
revamp prototype features #5283
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
revamp prototype features #5283
Conversation
💊 CI failures summary and remediationsAs of commit 1876e85 (more details on the Dr. CI page):
1 failure not recognized by patterns:
This comment was automatically generated by Dr. CI (expand for details).Please report bugs/suggestions to the (internal) Dr. CI Users group. |
@@ -1,185 +0,0 @@ | |||
import functools |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this removed since it only partially aligns with the new design. We can add tests back if the design is more stable.
@@ -267,69 +265,6 @@ def _make_sharded_datapipe(root: str, dataset_size: int) -> IterDataPipe[Dict[st | |||
return dp | |||
|
|||
|
|||
def _read_mutable_buffer_fallback(file: BinaryIO, count: int, item_size: int) -> bytearray: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These functions are not removed, but rather moved into torchvision.prototype.utils._internal
since they are used in datasets to read binary files as well as for reading the raw bytes for encoded images for the new features.
CXCYWH = enum.auto() | ||
|
||
|
||
def to_parts(input: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The conversion logic will live under the transforms and will be added back in a later PR.
class EncodedImage(EncodedData): | ||
# TODO: Use @functools.cached_property if we can depend on Python 3.8 | ||
@property | ||
def image_size(self) -> Tuple[int, int]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In some cases we need the image size in the datasets to instantiate a bounding box. This is convenient way to probe it, since PIL
only reads the first few bytes to get this information.
return dict() | ||
|
||
@classmethod | ||
def __torch_function__( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be added back with the transforms rework.
sequence: List[Any] = [] | ||
for item in obj: | ||
result = apply_recursively(fn, item) | ||
if isinstance(result, collections.abc.Sequence) and hasattr(result, "__inline__"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes no sense yet, but it will on a later PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @pmeier , I only took a very brief look. As discussed offline I'll approve to unlock and will give it a more in-depth look later
* revamp prototype features (#5283) * remove decoding from prototype datasets (#5287) * remove decoder from prototype datasets * remove unused imports * cleanup * fix readme * use OneHotLabel in SEMEION * improve voc implementation * revert unrelated changes * fix semeion mock data * fix pcam * readd functional transforms API to prototype (#5295) * readd functional transforms * cleanup * add missing imports * remove __torch_function__ dispatch * readd repr * readd empty line * add test for scriptability * remove function copy * change import from functional tensor transforms to just functional * fix import * fix test * fix prototype features and functional transforms after review (#5377) * fix prototype functional transforms after review * address features review * make mypy more strict on prototype features * make mypy more strict for prototype transforms * fix annotation * fix kernel tests * add automatic feature type dispatch to functional transforms (#5323) * add auto dispatch * fix missing arguments error message * remove pil kernel for erase * automate feature specific parameter detection * fix typos * cleanup dispatcher call * remove __torch_function__ from transform dispatch * remove auto-generation * revert unrelated changes * remove implements decorator * change register parameter order * change order of transforms for readability * add documentation for __torch_function__ * fix mypy * inline check for support * refactor kernel registering process * refactor dispatch to be a regular decorator * split kernels and dispatchers * remove sentinels * replace pass with ... * appease mypy * make single kernel dispatchers more concise * make dispatcher signatures more generic * make kernel checking more strict * revert doc changes * address Franciscos comments * remove inplace * rename kernel test module * fix inplace * remove special casing for pil and vanilla tensors * address comments * update docs * cleanup features / transforms feature branch (#5406) * mark candidates for removal * align signature of resize_bounding_box with corresponding image kernel * fix documentation of Feature * remove interpolation mode and antialias option from resize_segmentation_mask * remove or privatize functionality in features / datasets / transforms
Summary: * revamp prototype features (#5283) * remove decoding from prototype datasets (#5287) * remove decoder from prototype datasets * remove unused imports * cleanup * fix readme * use OneHotLabel in SEMEION * improve voc implementation * revert unrelated changes * fix semeion mock data * fix pcam * readd functional transforms API to prototype (#5295) * readd functional transforms * cleanup * add missing imports * remove __torch_function__ dispatch * readd repr * readd empty line * add test for scriptability * remove function copy * change import from functional tensor transforms to just functional * fix import * fix test * fix prototype features and functional transforms after review (#5377) * fix prototype functional transforms after review * address features review * make mypy more strict on prototype features * make mypy more strict for prototype transforms * fix annotation * fix kernel tests * add automatic feature type dispatch to functional transforms (#5323) * add auto dispatch * fix missing arguments error message * remove pil kernel for erase * automate feature specific parameter detection * fix typos * cleanup dispatcher call * remove __torch_function__ from transform dispatch * remove auto-generation * revert unrelated changes * remove implements decorator * change register parameter order * change order of transforms for readability * add documentation for __torch_function__ * fix mypy * inline check for support * refactor kernel registering process * refactor dispatch to be a regular decorator * split kernels and dispatchers * remove sentinels * replace pass with ... * appease mypy * make single kernel dispatchers more concise * make dispatcher signatures more generic * make kernel checking more strict * revert doc changes * address Franciscos comments * remove inplace * rename kernel test module * fix inplace * remove special casing for pil and vanilla tensors * address comments * update docs * cleanup features / transforms feature branch (#5406) * mark candidates for removal * align signature of resize_bounding_box with corresponding image kernel * fix documentation of Feature * remove interpolation mode and antialias option from resize_segmentation_mask * remove or privatize functionality in features / datasets / transforms Reviewed By: sallysyw Differential Revision: D34265747 fbshipit-source-id: 569ed9f74ac0c026391767c3b422ca0147f55ead
This PR was extracted from a prototype branch that changed multiple things. I tried to separate as much as possible, but there are still multiple changes here. I'll explain the large changes below and leave more comments inline for other changes.
Note, that we are not merging into
main
but into a temporary branch. All lint errors will be fixed when merging the other branch intomain
.Streamline
Feature
implementationThe old implementation was quite convoluted. This patch simplifies this by a lot without hurting flexibility. This is achieved by dropping the
like
keyword from the constructor and have a dedicated.new_like()
method for that.Rework
Label
Currently,
Label
took acategory
parameter to store the human readable name. As explained in #5045 this has a major downside: when stacking multiple labels like in a batch the category does not need to be the same. This PR changes this to have acategories
parameter. which is constant for all labels of the same dataset. In addition, this PR adds a.to_categories()
method that turns a label into the human readable form.Add features for encoded data
This was discussed in #5075 (comment).