Description
This is an RFC that continues the discussion in #5000 by @ain-soph and PRs #4995 and #5110 on updating the functional tensor methods in F.* to accept learnable parameters (tensors with requires_grad=True) and to propagate the gradient.
For the motivation and context, please see #5000.
Proposal
Torchvision transformations work on PIL images and torch Tensors and accept scalars or lists of scalars as parameters. For example:
```python
x = torch.rand(1, 3, 32, 32)
alpha = 45
center = [1, 2]
out = F.rotate(x, alpha, interpolation=InterpolationMode.BILINEAR, center=center)
# out is a tensor
```
The proposal is to be able to learn parameters like alpha and center using gradient descent:
```diff
 x = torch.rand(1, 3, 32, 32)
- alpha = 45
+ alpha = torch.tensor(45.0, requires_grad=True)
- center = [1, 2]
+ center = torch.tensor([1.0, 2.0], requires_grad=True)
 out = F.rotate(x, alpha, interpolation=InterpolationMode.BILINEAR, center=center)
 # out is a tensor that requires grad
 assert out.requires_grad
 # parameters can have grads:
 out.mean().backward()  # some dummy criterion
 assert alpha.grad is not None
 assert center.grad is not None
```
while also keeping the previous API (no BC-breaking changes).
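If the proposal lands, fitting a rotation angle by gradient descent could look like the following sketch (hypothetical usage: the current stable F.rotate does not yet accept a learnable tensor angle, and the target here is a dummy):
```python
import torch
import torchvision.transforms.functional as F
from torchvision.transforms import InterpolationMode

x = torch.rand(1, 3, 32, 32)
target = torch.rand(1, 3, 32, 32)  # dummy target, for illustration only

alpha = torch.tensor(45.0, requires_grad=True)
optimizer = torch.optim.SGD([alpha], lr=0.1)

for _ in range(100):
    optimizer.zero_grad()
    out = F.rotate(x, alpha, interpolation=InterpolationMode.BILINEAR)
    loss = (out - target).pow(2).mean()  # some dummy criterion
    loss.backward()  # gradient flows back to alpha
    optimizer.step()
```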
Implementation
In terms of API, it would require updates like:
```diff
 def rotate(
     img: Tensor,
-    angle: float,
+    angle: Union[float, int, Tensor],
     interpolation: InterpolationMode = InterpolationMode.NEAREST,
     expand: bool = False,
-    center: Optional[List[int]] = None,
+    center: Optional[Union[List[int], Tuple[int, int], Tensor]] = None,
     fill: Optional[List[float]] = None,
     resample: Optional[int] = None,
 ) -> Tensor:
```
Note: we need to keep transforms torch jit scriptable, so we are also limited by what torch jit script supports (simply adding Union[float, Tensor] does not always work and can break compatibility with the stable version).
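One pattern that can keep an op scriptable is to normalize the parameter to a float tensor up front, so the rest of the code only deals with a single Tensor branch. A minimal sketch, assuming a PyTorch version whose TorchScript supports Union annotations (the helper name is hypothetical, not an existing torchvision function):
```python
import torch
from torch import Tensor
from typing import Union


def _to_float_tensor(value: Union[int, float, Tensor]) -> Tensor:
    # Hypothetical helper: turn a scalar parameter into a float tensor so that
    # downstream ops see a single type and autograd can track tensor inputs.
    if isinstance(value, Tensor):
        # int/bool tensors cannot require grad, so casting them to float is safe
        return value if value.is_floating_point() else value.float()
    return torch.tensor(float(value))


# Should remain scriptable under the stated assumption:
scripted = torch.jit.script(_to_float_tensor)
```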
In terms of implementation, we have to ensure that:
- methods with updated parameters still support all previous data types
- methods are torch jit scriptable
- methods verify that the input image is a float tensor (no grad propagation otherwise)
- methods propagate grads for tensor inputs, i.e. all internal ops in the tensor branch propagate grads (see the gradcheck sketch below)
- only floating-point parameters can accept values as Tensors
  - for example, rotate with a learnable floating-point angle
  - IMO, we can't make the output (integer) size learnable in the resize op (please correct me if there is a way)
  - certain integer parameters can be promoted to float, e.g. translate in affine
Example with affine and rotate ops: #5110
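To verify the grad-propagation requirement, torch.autograd.gradcheck can compare analytical and numerical gradients w.r.t. the parameter. A minimal sketch, assuming the updated F.rotate from #5110 that accepts a tensor angle (gradcheck expects float64 inputs):
```python
import torch
import torchvision.transforms.functional as F
from torchvision.transforms import InterpolationMode

x = torch.rand(1, 3, 32, 32, dtype=torch.float64)


def fn(angle: torch.Tensor) -> torch.Tensor:
    # Only the angle is a gradcheck input; the image is captured as a constant.
    return F.rotate(x, angle, interpolation=InterpolationMode.BILINEAR)


angle = torch.tensor(20.0, dtype=torch.float64, requires_grad=True)
torch.autograd.gradcheck(fn, (angle,))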
Transforms to update
- normalize, params: mean and std
- adjust_brightness, params: brightness_factor
- adjust_contrast, params: contrast_factor
- adjust_saturation, params: saturation_factor
- adjust_hue, params: hue_factor
- adjust_gamma, params: gamma, gain
- rotate, params: angle, center (#5110: Make F.rotate/F.affine accept learnable params)
- affine, params: angle, translate, scale, shear (#5110: Make F.rotate/F.affine accept learnable params)
- gaussian_blur, params: kernel_size, sigma ?
- posterize, params: bits ?
- solarize, params: threshold
- adjust_sharpness, params: sharpness_factor
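To illustrate the kind of change each entry would need, here is a minimal, self-contained sketch for adjust_brightness. This is hypothetical code, not the actual torchvision internals, and it assumes float images in [0, 1]:
```python
import torch
from torch import Tensor
from typing import Union


def adjust_brightness_sketch(img: Tensor, brightness_factor: Union[float, Tensor]) -> Tensor:
    # Keep the factor as a tensor so autograd can track it when it requires grad.
    if not isinstance(brightness_factor, Tensor):
        brightness_factor = torch.tensor(float(brightness_factor), device=img.device)
    # Gradients only propagate through float images.
    if not img.is_floating_point():
        raise TypeError("img must be a float tensor for gradient propagation")
    # Blend with a black image; with zeros this reduces to scaling by the factor.
    black = torch.zeros_like(img)
    out = brightness_factor * img + (1.0 - brightness_factor) * black
    return out.clamp(0.0, 1.0)  # assumes images in [0, 1]
```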
Please comment here if I'm missing any op that we could add to the list.