Description
We are currently in the process of revamping the `transforms` module, but there is still a lot of porting left. The design is now stable enough that the porting process should be manageable for someone not intimately familiar with the new design. Thus, we are actively looking for contributions to help us finish this.
Here is the list of transformations that still need to be ported to achieve feature parity (minus some special transformations) between the new and old API:
- Port `transforms.Pad` to `prototype.transforms` #5521
- Port `transforms.RandomCrop` to `prototype.transforms` #5522
- Port `transforms.RandomHorizontalFlip` to `prototype.transforms` #5523
- Port `transforms.RandomVerticalFlip` to `prototype.transforms` #5524
- Port `transforms.RandomPerspective` to `prototype.transforms` #5525
- Port `transforms.FiveCrop` to `prototype.transforms` #5526
- Port `transforms.TenCrop` to `prototype.transforms` #5527
- Port `transforms.ColorJitter` to `prototype.transforms` #5528
- Port `transforms.RandomRotation` to `prototype.transforms` #5529
- Port `transforms.RandomAffine` to `prototype.transforms` #5530
- Port `transforms.GaussianBlur` to `prototype.transforms` #5531
- Port `transforms.RandomInvert` to `prototype.transforms` #5532
- Port `transforms.RandomPosterize` to `prototype.transforms` #5533
- Port `transforms.RandomSolarize` to `prototype.transforms` #5534
- Port `transforms.RandomAdjustSharpness` to `prototype.transforms` #5535
- Port `transforms.RandomAutocontrast` to `prototype.transforms` #5536
- Port `transforms.RandomEqualize` to `prototype.transforms` #5537
- Port `transforms.LinearTransformation` to `prototype.transforms` #5538
- `RandomHorizontalFlip` (Transforms without dispatcher #5421)
- `Resize` (Transforms without dispatcher #5421)
- `CenterCrop` (Transforms without dispatcher #5421)
- `RandomResizedCrop` (Transforms without dispatcher #5421)
- `Normalize` (Transforms without dispatcher #5421)
- `RandomErasing` (Transforms without dispatcher #5421)
- AutoAugment transforms:
  - `RandAugment` (Transforms without dispatcher #5421)
  - `TrivialAugmentWide` (Transforms without dispatcher #5421)
  - `AutoAugment` (Transforms without dispatcher #5421)
  - add prototype AugMix transform #5492
There is an issue for each transformation. Please comment on the respective issue if you want to take up a task so we can assign it to you.
Here is a recipe for how the porting process looks:
- Port the kernels from `transforms.functional_tensor` and `transforms.functional_pil` to `prototype.transforms.functional`. In most cases this means just binding them to a new name (`vision/torchvision/prototype/transforms/functional/_geometry.py`, lines 14 to 15 in 97385df):

  ```python
  horizontal_flip_image_tensor = _FT.hflip
  horizontal_flip_image_pil = _FP.hflip
  ```

  The naming scheme is `{kernel_name}_{feature_type}_{feature_subtype}`. In the example above, the kernel name is `horizontal_flip`, the feature type is `image`, and the subtypes are `tensor` and `pil`.

  In general, the new kernels should have the same signature as the old dispatchers from `torchvision.functional`. In most cases this is given by default, but sometimes there is some common pre-processing performed in the dispatchers before the kernels are called. In these cases, the canonical way is to move the common functionality into a private helper function and define the new kernels to call the helper first and the old kernel afterwards.
- Create a new transform in `prototype.transforms` that inherits from `prototype.transforms.Transform`. The constructor can be copy-pasted from the corresponding transform in `transforms`.
- Implement the `_transform` method. It receives two arguments: `input` and `params` (see below). `input` can be any non-container object, i.e. no lists, tuples, dictionaries, and so on, so the implementation needs to check the input type and dispatch accordingly. The general behavior should be to handle what we can and let the rest pass through (there are exceptions to this; see below). The implementation should look something like this:

  ```python
  class Foo(Transform):
      def _transform(self, input: Any, params: Dict[str, Any]) -> Any:
          if isinstance(input, features.Image):
              output = F.foo_image_tensor(input, ...)
              return features.Image.new_like(input, output)
          elif is_simple_tensor(input):
              return F.foo_image_tensor(input, ...)
          elif isinstance(input, PIL.Image.Image):
              return F.foo_image_pil(input, ...)
          else:
              return input
  ```
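The "move common pre-processing into a private helper" idea from the kernel-porting step can be sketched as follows. Everything here is invented for illustration: `foo`, `_foo_preprocess`, and the stand-in "old" kernel are not real torchvision names, and plain Python lists stand in for image tensors.

```python
# Hypothetical sketch of the pre-processing helper pattern; `foo`,
# `_foo_preprocess`, and `_old_foo_tensor_kernel` are invented names,
# not real torchvision functions.

def _old_foo_tensor_kernel(image, fill):
    # stand-in for an existing kernel that expects `fill` as a list
    return [pixel + fill[0] for pixel in image]

def _foo_preprocess(fill):
    # common pre-processing that the old dispatcher performed
    # before calling the per-type kernels
    if isinstance(fill, (int, float)):
        fill = [float(fill)]
    return fill

def foo_image_tensor(image, fill=0):
    # new kernel: run the shared helper first, then the old kernel
    return _old_foo_tensor_kernel(image, _foo_preprocess(fill))
```

With this split, `foo_image_pil` could call the same `_foo_preprocess` helper, so both new kernels keep the old dispatcher's signature without duplicating the pre-processing.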
- Some transformations could in theory support other feature types such as bounding boxes, but we currently don't have kernels for them or never will. In these cases it is crucial not to perform the transformation on the image while letting the rest pass through, because that invalidates the correspondence. For example, applying `RandomRotation` only to an image but ignoring a bounding box renders the bounding box invalid. To avoid this, overwrite the `forward` method and fail if unsupported types are detected (`vision/torchvision/prototype/transforms/_geometry.py`, lines 74 to 78 in 97385df):

  ```python
  def forward(self, *inputs: Any) -> Any:
      sample = inputs if len(inputs) > 1 else inputs[0]
      if has_any(sample, features.BoundingBox, features.SegmentationMask):
          raise TypeError(f"BoundingBox'es and SegmentationMask's are not supported by {type(self).__name__}()")
      return super().forward(sample)
  ```
- Some random transformations take a `p` parameter that indicates the probability of applying the transformation. Overwrite the `forward` method and perform this check there before calling the `forward` of the super class (`vision/torchvision/prototype/transforms/_augment.py`, lines 102 to 105 in 97385df):

  ```python
  elif torch.rand(1) >= self.p:
      return sample

  return super().forward(sample)
  ```
- Some random transformations need to sample parameters at runtime. In the old implementations this is usually done in a

  ```python
  @staticmethod
  def get_params():
      ...
  ```

  The new architecture is similar but not the same: you can overwrite the `_get_params(self, sample) -> Dict[str, Any]` method (`vision/torchvision/prototype/transforms/_geometry.py`, lines 114 to 116 in 97385df):

  ```python
  def _get_params(self, sample: Any) -> Dict[str, Any]:
      image = query_image(sample)
      _, height, width = get_image_dimensions(image)
  ```

  The returned dictionary is available through the `params` parameter in the `_transform` method. The `query_image` function used above finds an image in the `sample` without worrying about its actual structure, which is useful if the image dimensions are needed to generate the parameters.
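To see how `forward`, `_get_params`, and `_transform` fit together, here is a minimal, self-contained mock of the flow. This is not the actual `prototype.transforms.Transform` base class (which also walks nested sample structures), and `RandomShift` is an invented transform operating on plain lists of numbers.

```python
import random
from typing import Any, Dict


class Transform:
    # simplified stand-in for prototype.transforms.Transform:
    # sample the params once per call, then hand them to _transform
    def forward(self, sample: Any) -> Any:
        params = self._get_params(sample)
        return self._transform(sample, params)

    def _get_params(self, sample: Any) -> Dict[str, Any]:
        return {}

    def _transform(self, input: Any, params: Dict[str, Any]) -> Any:
        raise NotImplementedError


class RandomShift(Transform):
    # invented transform: adds a randomly sampled offset to every
    # "pixel" of a list-of-numbers "image"
    def _get_params(self, sample: Any) -> Dict[str, Any]:
        return {"offset": random.randint(1, 3)}

    def _transform(self, input: Any, params: Dict[str, Any]) -> Any:
        if isinstance(input, list):
            return [pixel + params["offset"] for pixel in input]
        return input  # let unsupported types pass through


t = RandomShift()
out = t.forward([0, 10, 20])
# every pixel is shifted by the *same* sampled offset, because the
# params are drawn once in forward() and reused in _transform()
```

The key point the mock illustrates is that randomness lives in `_get_params`, so one sampled set of parameters can be applied consistently to every supported element of a sample.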
These transformations are not simple, so there might be cases where the recipe above is not sufficient. If you have any questions or hit blockers, feel free to send a PR with what you have and ping me there so I can have a look and help you out.