
[feature request] rgb2lab / rgb2hsv / rgb2gray and other color space conversions (maybe upstream from kornia? or colorsys python core module?) #4029


Open
vadimkantorov opened this issue Jun 9, 2021 · 25 comments


@vadimkantorov

vadimkantorov commented Jun 9, 2021

It would be nice to have these available in the standard library (on both CPU and GPU) rather than rolling your own. Kornia seems to have implemented them: https://kornia.readthedocs.io/en/latest/color.html, skimage as well: https://scikit-image.org/docs/dev/api/skimage.color.html, and Python core too: https://docs.python.org/3/library/colorsys.html, but I think they're basic enough to merit inclusion of similar functions in core.

(My use case: SLIC superpixel extraction, which requires rgb2lab, and the selective search algorithm, which requires rgb2hsv and rgb2lab.)
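For reference, a rough sketch of what such an rgb2lab could look like on float tensors — assuming sRGB input in [0, 1] with shape (..., 3, H, W) and a D65 white point; the constants follow the usual sRGB -> XYZ -> Lab pipeline, and this is not an official API of any library:

import torch

def rgb_to_lab(img: torch.Tensor) -> torch.Tensor:
    # undo sRGB gamma to get linear RGB
    lin = torch.where(img > 0.04045, ((img + 0.055) / 1.055) ** 2.4, img / 12.92)
    r, g, b = lin.unbind(dim=-3)
    # linear RGB -> XYZ (sRGB primaries, D65 white point)
    x = 0.412453 * r + 0.357580 * g + 0.180423 * b
    y = 0.212671 * r + 0.715160 * g + 0.072169 * b
    z = 0.019334 * r + 0.119193 * g + 0.950227 * b
    # normalize by the D65 reference white, then apply the Lab nonlinearity
    xyz = torch.stack([x / 0.95047, y, z / 1.08883], dim=-3)
    f = torch.where(xyz > 0.008856, xyz ** (1.0 / 3.0), 7.787 * xyz + 16.0 / 116.0)
    fx, fy, fz = f.unbind(dim=-3)
    l = 116.0 * fy - 16.0   # L lies in [0, 100]
    a = 500.0 * (fx - fy)   # a and b are signed
    b_ = 200.0 * (fy - fz)
    return torch.stack([l, a, b_], dim=-3)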

cc @vfdev-5

@datumbox
Contributor

Thanks @vadimkantorov. This is useful feedback as we are currently in discussions to improve our transforms.

@vadimkantorov
Author

Maybe this is not even related to transforms per se, since it is also useful in non-augmentation contexts.

For transforms, I really hope there will be more native batched transforms, a clear separation between a function and the random sampling of its parameters, and an easy way to apply inverse transforms (where possible). Getting padding masks for the direct and inverse transforms would also be useful.

@datumbox
Contributor

@vadimkantorov The transforms tag covers anything under the transforms module, including augmentations and the functional and class transforms. Similar requests have been raised in the past (see #3224 for a summary), so that's something we are looking into.

@fmassa
Member

fmassa commented Jun 21, 2021

@vadimkantorov I think adding an rgb2lab would be nice in principle, but the issue I currently see with it is that there is no way for the library to know whether the input tensor you are passing is indeed in the RGB colorspace. So you can silently get wrong results if you are not careful.
As of now, torchvision assumes that a tensor is either in RGB or grayscale space, and doesn't try to make any further assumptions. This works fine because RGB and grayscale have different numbers of channels, but things break if we start supporting HSV / LAB / etc. colorspaces.

Do you think this is a problem per se, or is it up to the user to make sure that the input tensors are in the RGB colorspace?

@vadimkantorov
Author

I think it's not a problem per se, since in PyTorch we mostly do not encode semantics into the tensor type system, with the goal of being flexible and letting users quickly build on things. The same goes for NumPy.

The same problem exists with OpenCV returning BGR images, but it's up to users to remember to convert when needed.

In my own practice, I use suffix notation to bring extra attention to semantics, e.g. names like mean_rgb and std_rgb in my input normalization functions, especially if I read images with OpenCV.
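For example (the mean/std values below are the usual ImageNet statistics, purely illustrative):

import torch

# the _rgb suffix makes the expected colorspace explicit at the call site
mean_rgb = torch.tensor([0.485, 0.456, 0.406])
std_rgb = torch.tensor([0.229, 0.224, 0.225])

def normalize_rgb(img_rgb):
    # img_rgb: float tensor of shape (3, H, W), channels in RGB order
    return (img_rgb - mean_rgb[:, None, None]) / std_rgb[:, None, None]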

If these functions live in a separate namespace and are not discussed in every other tutorial, their use will be limited to those looking for them, so they should not confuse users of the most common use cases.

@oke-aditya
Contributor

If I understand the problem correctly, torchvision assumes that the colorspace is RGB. I think if we provide conversions from RGB to other colorspaces and vice versa, it would work.

We would need to add documentation saying that all the transforms work on the RGB colorspace only and that we do not support any other.

This is something we faced earlier with bbox conversions: we settled on the xyxy assumption and provided a utility to convert to and from xyxy.

I guess a similar solution would work here?
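For reference, that utility is torchvision.ops.box_convert; a color conversion utility could mirror its API:

import torch
from torchvision.ops import box_convert

boxes_xywh = torch.tensor([[10.0, 20.0, 30.0, 40.0]])
boxes_xyxy = box_convert(boxes_xywh, in_fmt="xywh", out_fmt="xyxy")
# tensor([[10., 20., 40., 60.]])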

Would love to hear from @fmassa and @vadimkantorov 😃

@vadimkantorov
Author

I think many transforms actually don't care about the input color space and support everything.

Yes, it would be good to add notes to the docs about supported color spaces. But even if the docs just say somewhere that the transforms were designed for RGB, and the manual color space conversion utils live off to the side, that would be sufficient and not cause confusion, as long as the image loaders load RGB by default.

@NicolasHug
Member

NicolasHug commented Jun 28, 2021

I'm curious about the use case for this. @vadimkantorov, could you describe a bit more why this would be useful to you?

I assume that passing a LAB image into a pretrained model would lead to invalid predictions, so I guess this would only be for new training experiments?

@vadimkantorov
Author

vadimkantorov commented Jun 28, 2021

You are right. Of course it would be invalid to feed a LAB image into an RGB-trained model. So yes, this would only be for new models or new functionality.

My use case is this: reimplementing SLIC superpixel extraction in PyTorch, which requires rgb2lab.

@NicolasHug
Member

Thanks for the feedback!

If we consider that interpolation works the same in RGB and in Lab, it's true that most transforms don't make a strong assumption w.r.t. the color space, except maybe stuff like ColorJitter or ToGrayScale.

But as noted above, there aren't just transforms: there are also operators and models, and at the moment literally everything in torchvision is built with RGB/Grayscale in mind.

On top of that, for consistency and coherence of the library, I think it's best not to introduce things for which we don't have an explicit need within the library.

It's still valuable to support new/alternative use cases though, but IMHO this would be better done in a somewhat separate way: something that we don't officially support but that power users could still find and use, at their own risk. Typically such a conversion snippet could be part of an FAQ entry with a strong disclaimer regarding its compatibility/support... but we don't have an FAQ. Maybe it could be part of a gallery example as well? Or even just a snippet here below that people could copy/paste?

@vadimkantorov
Author

Well, one also does not have an explicit need for drawing bounding boxes within the library (and yet it was merged in), so I think this argument is not very valid. In the end, torchvision is supposed to help users with standardized ways of doing very frequent chores in computer vision tasks. A reference implementation of color space conversion is one of these (it's been available in OpenCV since its beginning), and it'd be useful to have one working with GPU tensors in core, like in Kornia (as opposed to OpenCV / skimage).

I think putting it into torchvision.color or torchvision.utils.color would make sense and not confuse anyone. The fact that the models don't support all possible color spaces is obvious, but this should not prevent adding these reference conversion implementations. Putting them into an FAQ is also not a bad option, but I don't see why having these copy-pastes in various codebases would be better than providing a standard, tested implementation. Moreover, conversion APIs have stabilized over dozens of years in various libraries, so a good API can probably just be copied from some other library and would not be subject to frequent change. The colorsys module also exists in the Python core standard library, but it doesn't work with arrays, forcing one to write loops.
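To make the colorsys limitation concrete, converting a whole image forces a per-pixel Python loop (a sketch):

import colorsys
import numpy as np

img = np.random.rand(64, 64, 3)  # HWC float RGB in [0, 1]
# colorsys converts one pixel at a time, so a full image needs a loop
hsv = np.array([colorsys.rgb_to_hsv(*px) for px in img.reshape(-1, 3)]).reshape(img.shape)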

@NicolasHug
Member

Well, one also does not have an explicit need for drawing bounding boxes within the library (and yet it was merged in), so I think this argument is not very valid

That's why it's a visualization util, not part of the core set of transforms / operators. And the drawing of bounding boxes and masks integrates perfectly with the rest of the library's ecosystem in terms of expected inputs and outputs. It's not an outlier, which a Lab conversion transform would be.

I think putting it into torchvision.color or torchvision.utils.color would make sense and not confuse anyone

Unfortunately, we already have the rgb <-> grayscale conversions in the transforms module.

The fact that the models don't support all possible color spaces is obvious

It's obvious now, because we only support RGB. The second we start having other color space conversions, users will assume that the models and transforms will "magically" work with whatever tensor you pass. I'm afraid this isn't an assumption we can make, and as library authors we try hard to prevent users from doing wrong things.

@vadimkantorov
Author

It's not an outlier, which a Lab conversion transform would be.

Lab, YCrCb, HSV conversions are classical computer vision tools, far from an outlier in the broader context.

Unfortunately, we already have the rgb <-> grayscale conversions in the transforms module.

If wanted, this could be refactored into another namespace, like what happened in core PyTorch with the new torch.linalg and torch.fft namespaces (and there the usage was probably much higher).

A new "util" namespace can be started even if some older methods exist in "transforms" to enable and help with classical computer vision pipelines.

The second we start having other color space conversions, users will assume that the models and transforms will "magically" work with whatever tensor you pass.

I think this is false. Users are already well aware that preprocessing is a must (mean subtraction, plus [-1, 1] normalization).

we try hard to prevent users from doing wrong things.

Well, nothing can prevent users from corrupting the input tensor if they are feeling playful, and the library can't do anything about this in the end besides clear documentation. If these colorspace utils are not advertised in the main tutorials, it should not be a problem. I don't see why anyone would apply colorsys functions out of the blue.

cc @fmassa

@NicolasHug
Member

NicolasHug commented Jun 28, 2021

Lab, YCrCb, HSV conversions are classical computer vision tools, far from an outlier in the broader context.

They'd be outliers in torchvision. That's my main point: the coherence of a library is very important, both for its maintenance and for its users. Yes, we want to support as many use cases as possible, but we can't just say "oh, this will be useful to someone, let's put it in". On top of preventing wrong results, the scope of a library is also important. As far as I can tell, Lab conversions are out of scope for the core of torchvision, at least at the moment. But we can think of ways to help the users who need them to implement them easily.

If wanted, this could be refactored into another namespace, like what happened in core PyTorch with the new torch.linalg and torch.fft namespaces (and there the usage was probably much higher)

Deprecating the widely used rgb <-> grayscale transforms just so we can introduce Lab conversions in a new submodule, so that users don't consider Lab images to be fully supported, isn't a reasonable strategy, unfortunately. (This sentence is way too long already, which is an indicator that there are too many things going on.)

And it's very different from what happened to torch.linalg BTW, which is not just a namespace change. The new linalg module comes with tons of new features and improvements: https://60c76d975cb350913c6c73c8--shiftlab-pytorch-github-io.netlify.app/blog/pytorch-1.9-released/#torchlinalg

Well, nothing can prevent users from corrupting the input tensor if they are feeling playful, and the library can't do anything about this in the end besides clear documentation.

You're right. Still, the point is that we should prevent what we can prevent. I'm not sure such an argument really helps move the discussion forward.

@vadimkantorov
Author

I'm not sure such an argument really helps move the discussion forward.

Hereby I disengage :) Please feel free to close the issue as "wontfix"

@datumbox
Contributor

@vadimkantorov I think it's good that we have your use case documented thoroughly in this post. There are parallel discussions on how to improve our transforms, so having this information is useful and can steer us in the right direction.

Concerning moving forward with this proposal, I think more time and discussion are needed and, as Nicolas mentioned, we need to find ways to avoid confusion. At the moment, it's unclear to me how we could add this and similar features without breaking compatibility or bloating the library. It's definitely on our radar to improve the situation, so please keep nudging us and providing your feedback. So instead of marking this as "wontfix", I think it's preferable to leave this open and mark it for additional discussion.

@oke-aditya
Contributor

oke-aditya commented Jun 28, 2021

Hi @datumbox @NicolasHug @vadimkantorov

I have a reasonable solution, though it might be improved upon with discussion.
(This might look like I'm rushing to a solution, but otherwise I might forget about this, and the solution too :( )

Create a simple transform in torchvision.transforms called ConvertColorspace().
Note that this would not handle RGB to grayscale, as changing the number of channels makes it inconsistent.

Usage:

# Instead of strings we could use Enums
new_img = F.convert_colorspace(orig_img, in_color="RGB", out_color="LAB")

# Internally we do all these conversions with `_methods` in a `_convert_colorspace.py` file,
# not exposing them to the user.

Similarly, we can provide a class for the same, T.ConvertColorspace.
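A rough sketch of how the dispatch could look, borrowing kornia's kornia.color converters as stand-ins for the hypothetical private _methods (illustrative only, not a concrete proposal for the internals):

import kornia.color as KC  # stand-ins; a real implementation would vendor its own kernels

_CONVERTERS = {
    ("RGB", "HSV"): KC.rgb_to_hsv,
    ("HSV", "RGB"): KC.hsv_to_rgb,
    ("RGB", "LAB"): KC.rgb_to_lab,
    ("LAB", "RGB"): KC.lab_to_rgb,
}

def convert_colorspace(img, in_color="RGB", out_color="LAB"):
    key = (in_color.upper(), out_color.upper())
    if key not in _CONVERTERS:
        raise ValueError(f"unsupported conversion: {in_color} -> {out_color}")
    return _CONVERTERS[key](img)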

Bloating issue
This does not bloat the library, as we add only one function and one class (single responsibility 😄).

Compatibility issue
This does not mean that we support BGR / LAB / HSV / other colorspace images in the transforms. We have written all the transforms with only RGB in mind. Note that models and any other utilities / ops will not support colorspaces other than RGB.

Maintenance issue
I think this does not break existing code, and we don't introduce a new namespace, which keeps the maintenance burden down.

Benefits to the end user:

  1. GPU accelerated!
  2. Works on batches.
  3. Scriptable
  4. Composable
  5. Tested

I would love to hear thoughts from all 😄

I hope that I have proposed something reasonable. Feel free to correct me!

@fmassa
Member

fmassa commented Jun 29, 2021

Hi,

Let me chime in a bit on this discussion. Thanks a ton @vadimkantorov, @NicolasHug, @datumbox and @oke-aditya for the ideas.

The same problem exists with OpenCV returning BGR images, but it's up to users to remember to convert when needed.

Yes, and this is a recurring issue everywhere in the Python ecosystem. How does PIL solve it? By attaching mode metadata to the Image object.

This is not the first instance of this type of problem: bounding boxes have the same issue, as they can have different input representations (xyxy, xywh, etc., as pointed out by @oke-aditya in his comment).

While it would be easy to add those color conversion functions to torchvision, the problem I want to minimize is users getting silent bugs because they're using LAB / HSV colorspaces.
One example: the LAB representation is not on the same scale as RGB (the A and B channels don't have a fixed range, IIRC). How would convert_image_dtype work in this situation? Can we support LAB images in uint8 format? Should resize clamp values to [0, 1] for float tensors (which won't work for LAB)? Should make_grid and other visualization functions work with the LAB colorspace? What happens if we blend a LAB image with an RGB image, should it fail?
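To make the range mismatch concrete, a quick check using kornia's converter (assuming kornia is available; the ranges are a property of the CIE Lab definition, not of kornia):

import torch
import kornia.color as KC

img = torch.rand(1, 3, 8, 8)  # float RGB in [0, 1], NCHW
lab = KC.rgb_to_lab(img)
print(lab[:, 0].min().item(), lab[:, 0].max().item())    # L lives in [0, 100]
print(lab[:, 1:].min().item(), lab[:, 1:].max().item())  # a/b are signed, well outside [0, 1]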

Those are all questions that it would be good for us to answer before implementing such functions, so that we have a coherent story.

Here is one proposal (which is not good enough, but is one option):

class ImageTensor(torch.Tensor):
    def __init__(self, *args, mode='rgb', **kwargs):
        super().__init__(*args, **kwargs)
        self._img_mode = mode

    # do the other stuff needed so that `._img_mode` gets propagated
    # via __torch_function__, for functions that make sense

In this representation, ImageTensor is just like a Tensor, but carries metadata about what is the colorspace for a given tensor.
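A minimal runnable variant of that sketch, using Tensor.as_subclass (again an illustration, not an official torchvision type; propagating the mode through torch ops would still need the __torch_function__ work mentioned above):

import torch

class ImageTensor(torch.Tensor):
    @staticmethod
    def __new__(cls, data, *, mode="rgb"):
        t = torch.as_tensor(data).as_subclass(cls)
        t._img_mode = mode  # colorspace metadata rides along with the tensor
        return t

img = ImageTensor(torch.rand(3, 4, 4), mode="lab")
print(img._img_mode)                  # 'lab'
print(isinstance(img, torch.Tensor))  # True, so Tensor-based code keeps working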

A similar approach could be done for bounding boxes as well, and segmentation masks.

But what is the gotcha with this approach? torchscript won't work because of __torch_function__.

@oke-aditya
Contributor

Some great thoughts by @fmassa. These have broadened the discussion.

I had thought of creating a similar solution for bounding boxes and segmentation masks. Such a solution allows a lot of flexibility and could potentially help with detection transforms as well as rotated boxes, etc.

Similar to ImageTensor, classes such as Boxes and Masks could help (something similar to what Detectron2 does). Another point I would like to add:

-> Creating classes abstracts away the idea of a tensor!

All the downstream libraries built on top of torchvision would now need to use the Boxes and ImageTensor dtypes. These classes are like Tensor but not the same. This introduces an abstraction, sometimes limitations, and a small learning curve for users. Downstream libraries can no longer simply use a Tensor of shape [N, 4] for boxes, etc.

Well, this is a trade-off, and a big breaking change too.

@fmassa
Member

fmassa commented Jun 30, 2021

Such a solution allows a lot of flexibility and could potentially help with detection transforms as well as rotated boxes, etc.

Yes, exactly. That is the original motivation. I'm not yet 100% sure this is the best way forward, but it's one possibility.

Let's redirect discussions about transforms to the RFC from @pmeier, pmeier/torchvision-datasets-rework#1, for now. We will be iterating on the design to get to a state that we are all happy with.

All the downstream libraries built on top of torchvision would now need to use the Boxes and ImageTensor dtypes.
Downstream libraries can no longer simply use a Tensor of shape [N, 4] for boxes, etc.
Well, this is a trade-off, and a big breaking change too.

Those are good points, and they're one of the reasons why we have been careful about not adding those features without careful thought. There is probably a way of making the current functions still work in a BC way (as a Boxes is still a Tensor), but torchscript support might be tricky to get and needs to be evaluated.

@vadimkantorov vadimkantorov changed the title [feature request] rgb2lab and other color space conversions [feature request] rgb2lab / rgb2hsv / rgb2gray and other color space conversions Jan 17, 2022
@vadimkantorov
Author

rgb2hsv / hsv2rgb are also useful for interpolating between two colors: https://stackoverflow.com/questions/13488957/interpolate-from-one-color-to-another; this comes in handy for visualizations.
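A small stdlib-only sketch of that idea (naive HSV lerp; note it ignores hue wraparound across 0/1):

import colorsys

def lerp_color_hsv(rgb1, rgb2, t):
    # rgb1, rgb2: (r, g, b) triples in [0, 1]; t in [0, 1]
    h1 = colorsys.rgb_to_hsv(*rgb1)
    h2 = colorsys.rgb_to_hsv(*rgb2)
    return colorsys.hsv_to_rgb(*(a + (b - a) * t for a, b in zip(h1, h2)))

# blue -> red midpoint
print(lerp_color_hsv((0.0, 0.0, 1.0), (1.0, 0.0, 0.0), 0.5))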

@vadimkantorov vadimkantorov changed the title [feature request] rgb2lab / rgb2hsv / rgb2gray and other color space conversions [feature request] rgb2lab / rgb2hsv / rgb2gray and other color space conversions (maybe upstream from kornia?) Jan 17, 2022
@vadimkantorov vadimkantorov changed the title [feature request] rgb2lab / rgb2hsv / rgb2gray and other color space conversions (maybe upstream from kornia?) [feature request] rgb2lab / rgb2hsv / rgb2gray and other color space conversions (maybe upstream from kornia? or colorsys python module?) Jan 18, 2022
@vadimkantorov vadimkantorov changed the title [feature request] rgb2lab / rgb2hsv / rgb2gray and other color space conversions (maybe upstream from kornia? or colorsys python module?) [feature request] rgb2lab / rgb2hsv / rgb2gray and other color space conversions (maybe upstream from kornia? or colorsys python core module?) Jan 18, 2022
@vadimkantorov
Author

vadimkantorov commented Jul 8, 2022

rgb2gray has been implemented as https://pytorch.org/vision/stable/generated/torchvision.transforms.functional.rgb_to_grayscale.html#torchvision.transforms.functional.rgb_to_grayscale

The other conversions existing in opencv / kornia / colorsys / skimage would still be nice to have (rgb <-> hsv, etc.).


@vadimkantorov
Author

Btw, TensorFlow's tf.image package does contain rgb_to_hsv, rgb_to_yuv and a few others: https://www.tensorflow.org/api_docs/python/tf/image/rgb_to_hsv
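For example (per the TF docs, these expect float inputs in [0, 1] with channels last):

import tensorflow as tf

img = tf.random.uniform([1, 32, 32, 3])  # NHWC float RGB in [0, 1]
hsv = tf.image.rgb_to_hsv(img)
yuv = tf.image.rgb_to_yuv(img)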

@oke-aditya
Contributor

It might be feasible with the new transforms API (which I haven't explored yet). cc @vfdev-5

@vadimkantorov
Author

vadimkantorov commented Jun 21, 2023

Btw, _rgb2hsv / _hsv2rgb / _rgb_to_grayscale are already implemented in https://github.com/pytorch/vision/blob/main/torchvision/transforms/_functional_tensor.py and https://github.com/pytorch/vision/blob/main/torchvision/transforms/v2/functional/_color.py and used as part of adjust_hue, but of course rgb2hsv / hsv2rgb would be useful as public APIs in other contexts too.
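For the adventurous, they can already be reached today, with the caveat that these are private, so the import path and behavior may change without notice (a sketch):

import torch
from torchvision.transforms._functional_tensor import _rgb2hsv, _hsv2rgb  # private API

img = torch.rand(3, 32, 32)  # float RGB in [0, 1]
hsv = _rgb2hsv(img)          # channels become hue, saturation, value
back = _hsv2rgb(hsv)
print(torch.allclose(img, back, atol=1e-5))  # round-trip is close, not bit-exact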
