Refactor of transforms #240

chsasank · 2017-09-03T18:15:09Z

To allow easy subclassing to extend to joint transforms.

See #230.
First-cut implementation; it still needs polish.
A lot is still missing, such as docs.

(cherry picked from commit 71afec427baca8e37cd9e10d98812bc586e9a4ac)

fmassa

This overall looks good.
I think, though, that it might be better to make the random parameter generation standalone functions. What do you think?
Also, it might be good to add some extra type checks to verify that the user passed the right type to the functions, as the inputs can be either PIL Image objects or torch Tensors.
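A minimal sketch of the kind of type check suggested above, assuming the functional `crop` signature shown in this PR's diff. `StubImage` and `_is_image` are hypothetical stand-ins (for `PIL.Image.Image` and a helper like `_is_pil_image`) so the snippet runs without any dependencies; it is an illustration, not the code that was merged.

```python
class StubImage(object):
    """Hypothetical stand-in for PIL.Image.Image; only tracks (width, height)."""

    def __init__(self, width, height):
        self.size = (width, height)

    def crop(self, box):
        # Same box convention as PIL: (left, upper, right, lower).
        left, upper, right, lower = box
        return StubImage(right - left, lower - upper)


def _is_image(img):
    # Hypothetical helper, playing the role of a check like _is_pil_image.
    return isinstance(img, StubImage)


def crop(img, x, y, w, h):
    # Fail early with a clear message instead of an obscure AttributeError later.
    if not _is_image(img):
        raise TypeError('img should be an image. Got {}'.format(type(img)))
    return img.crop((x, y, x + w, y + h))


patch = crop(StubImage(10, 8), 1, 2, 3, 4)
print(patch.size)
```

Raising `TypeError` rather than using a bare `assert` keeps the check active even when Python runs with assertions disabled.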

torchvision/transforms.py

+    return img.crop((x, y, x + w, y + h))
+
+
+def scaled_crop(img, x, y, w, h, size, interpolation=Image.BILINEAR):


torchvision/transforms.py

@@ -214,6 +254,13 @@ def __init__(self, size):
        else:
            self.size = size

+    def get_params(self, img):


torchvision/transforms.py

@@ -298,6 +342,16 @@ def __init__(self, size, padding=0):
            self.size = size
        self.padding = padding

+    def get_params(self, img):


torchvision/transforms.py

@@ -352,7 +401,7 @@ def __init__(self, size, interpolation=Image.BILINEAR):
        self.size = size
        self.interpolation = interpolation

-    def __call__(self, img):
+    def get_params(self, img):


torchvision/transforms.py

+
+    def __call__(self, img):
+        x, y, w, h = self.get_params(img)
+        return scaled_crop(img, x, y, w, h, self.size, self.interpolation)


torchvision/transforms.py

+
+
+def crop(img, x, y, w, h):
+    return img.crop((x, y, x + w, y + h))


torchvision/transforms.py

+
+
+def normalize(tensor, mean, std):
+    # TODO: make efficient


alykhantejani

Thanks @chsasank this looks very good. I've left some inline comments.

I also agree with @fmassa about having standalone functions to generate random crop params. If we want to let users work purely with the functional interface (to build complex transform graphs), they will need a way to generate these params without creating an object.

torchvision/transforms.py

+        return img
+
+
+def to_pilimage(pic):


torchvision/transforms.py

-        y1 = int(round((h - th) / 2.))
-        return img.crop((x1, y1, x1 + tw, y1 + th))
+        x1, y1, tw, th = self.get_params(img)
+        return crop(img, x1, y1, tw, th)


torchvision/transforms.py

@@ -260,7 +344,7 @@ def __call__(self, img):
        Returns:
            PIL.Image: Padded image.
        """
-        return ImageOps.expand(img, border=self.padding, fill=self.fill)
+        return pad(img, self.padding, self.fill)


 class Lambda(object):


torchvision/transforms.py

+
+    def __call__(self, img):
+        x, y, w, h = self.get_params(img)
+        return scaled_crop(img, x, y, w, h, self.size, self.interpolation)


chsasank · 2017-09-16T08:42:18Z

reminder @alykhantejani @fmassa.

alykhantejani · 2017-09-16T10:31:55Z

Hi @chsasank from my side this looks pretty good although I think we should address the following before merging:

For RandomCrop and CenterCrop I think the objects should not do any of the parameter generation themselves. Instead, we should pick one of these two options:

Option 1

- Have a crop(x, y, w, h) function that users can call with any params (doing the param generation themselves).
- Have random_crop(size) and center_crop(size) convenience functions that contain the param generation.
- The RandomCrop and CenterCrop objects would just call these convenience functions.

Option 2

If we think the random parameter generation is sufficiently complex, have just a single crop function (a wrapper around PIL's Image.crop) plus helper functions that generate random crop bounds and center crop bounds.

The objects would then call a combination of get_x_params and crop.
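To make the two options concrete, here is a rough sketch under some loud assumptions: images are plain nested lists (rows of pixels) instead of PIL images, `size` is a single int for a square crop, and the helper names `get_center_crop_params` and `get_random_crop_params` are hypothetical placeholders for the "get_x_params" helpers mentioned above.

```python
import random


def crop(img, x, y, w, h):
    """Crop a nested-list image; stand-in for a wrapper around PIL's Image.crop."""
    return [row[x:x + w] for row in img[y:y + h]]


# Option 2: pure helpers that only compute crop bounds.
def get_center_crop_params(img, size):
    h, w = len(img), len(img[0])
    x = int(round((w - size) / 2.))
    y = int(round((h - size) / 2.))
    return x, y, size, size


def get_random_crop_params(img, size):
    h, w = len(img), len(img[0])
    x = random.randint(0, w - size)
    y = random.randint(0, h - size)
    return x, y, size, size


# Option 1: convenience functions that generate params and crop in one call.
def center_crop(img, size):
    return crop(img, *get_center_crop_params(img, size))


def random_crop(img, size):
    return crop(img, *get_random_crop_params(img, size))


img = [[4 * r + c for c in range(4)] for r in range(4)]
print(center_crop(img, 2))                         # Option 1 style
print(crop(img, *get_center_crop_params(img, 2)))  # Option 2 style
```

With Option 2 the caller keeps the generated bounds, so the same bounds can be reused on a second input. Option 1 hides them inside the call.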

nit: rename to_pilimage to to_pil_image

fmassa · 2017-09-16T13:45:34Z

Sorry for the delay in reviewing.

I think this is almost ready to go. About @alykhantejani's last comments, I'd go with a variant of option 2.

Given that one of the goals of this refactoring is to be able to extend random transforms to many inputs, I think it's better not to add randomness in the operations (so I'd avoid random_crop functions).
But I'm also unsure where we should put the random parameter generation so that it does not clutter the namespace (the generators are very tied to the current transform classes).

What about letting the get_params be a @staticmethod, so that it can be called without instantiating the class? Also, I'm not 100% satisfied with this name, but I have no better ideas now.

Thoughts?
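A small sketch of the `@staticmethod` idea, again assuming nested-list images and a square int `size` so that it runs without PIL. The `(x, y, w, h)` ordering follows the diff at this point in the thread; the exact signature of `get_params` here is an assumption for illustration, not necessarily what was merged.

```python
import random


def crop(img, x, y, w, h):
    """Crop a nested-list image (rows of pixels)."""
    return [row[x:x + w] for row in img[y:y + h]]


class RandomCrop(object):
    def __init__(self, size):
        self.size = size

    @staticmethod
    def get_params(img, size):
        # No `self`: callable as RandomCrop.get_params(...) without an instance.
        h, w = len(img), len(img[0])
        x = random.randint(0, w - size)
        y = random.randint(0, h - size)
        return x, y, size, size

    def __call__(self, img):
        x, y, w, h = self.get_params(img, self.size)
        return crop(img, x, y, w, h)


# Joint transform: draw the params once, apply them to both image and target.
image = [[(r, c) for c in range(6)] for r in range(6)]
target = [[6 * r + c for c in range(6)] for r in range(6)]
x, y, w, h = RandomCrop.get_params(image, 3)
image_patch = crop(image, x, y, w, h)
target_patch = crop(target, x, y, w, h)
```

Because the randomness lives only in `get_params`, the image and its target are guaranteed to be cropped at the same location, which is the joint-transform use case this refactor is aiming at.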

chsasank · 2017-09-16T15:59:08Z

> What about letting the get_params be a @staticmethod, so that it can be called without instantiating the class?

I like this idea too. I will go ahead with it, I guess.

chsasank · 2017-09-16T17:35:12Z

Most of the requested changes are done. Also added documentation.

Do you want the new functions in a separate namespace? If so, any name suggestion?

torchvision/transforms.py

+    Args:
+        img (PIL.Image): Image to be scaled.
+        size (sequence or int): Desired output size. If size is a sequence like
+            (w, h), output size will be matched to this. If size is an int,


alykhantejani · 2017-09-17T10:23:08Z

Hi @chsasank this looks good to go once the merge conflicts are resolved. As for the namespace, I don't like that we can now do both from torchvision.transforms import ToTensor and from torchvision.transforms import to_tensor, but I don't have any better ideas right now...

chsasank · 2017-09-18T08:23:09Z

Oh no. That means I have to change a whole lot of things, like the parameter order in the scale and crop functions, etc.

alykhantejani · 2017-09-18T09:26:23Z

@chsasank Just the params in Scale changed order, so should be quite a small change.

torchvision/transforms.py

+    return ImageOps.expand(img, border=padding, fill=fill)
+
+
+def crop(img, x, y, w, h):


alykhantejani

Thanks @chsasank looks good - just a few final small things to change and I think we're good to merge.

torchvision/transforms.py

+    return ImageOps.expand(img, border=padding, fill=fill)
+
+
+def crop(img, x, y, w, h):


torchvision/transforms.py

+    return img.crop((x, y, x + w, y + h))
+
+
+def scaled_crop(img, x, y, w, h, size, interpolation=Image.BILINEAR):


torchvision/transforms.py

+        y: Upper pixel coordinate.
+        w: Width of the cropped image.
+        h: Height of the cropped image.
+        size (sequence or int): Desired output size. Same semantics as ``scale``.


torchvision/transforms.py

+        PIL.Image: Cropped image.
+    """
+    assert _is_pil_image(img), 'img should be PIL Image'
+    img = crop(img, x, y, w, h)


fmassa · 2017-09-25T20:44:33Z

@chsasank once you change the x,y->i,j order this is good to merge! Thanks a lot!
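For readers following the rename, a sketch of what the `i, j, h, w` ordering could look like. It assumes `i` is the top row and `j` the left column (matrix-style indexing), which the thread does not spell out. Images are nested lists here, and `to_pil_box` is a hypothetical helper showing how such params would map onto PIL's `(left, upper, right, lower)` box.

```python
def crop(img, i, j, h, w):
    """Crop a nested-list image: i = top row, j = left column, h = height, w = width."""
    return [row[j:j + w] for row in img[i:i + h]]


def to_pil_box(i, j, h, w):
    """Hypothetical helper: (i, j, h, w) -> PIL-style (left, upper, right, lower)."""
    return (j, i, j + w, i + h)


img = [[10 * r + c for c in range(5)] for r in range(4)]
patch = crop(img, 1, 2, 2, 3)  # rows 1-2, columns 2-4
print(patch)
print(to_pil_box(1, 2, 2, 3))
```

The row-first ordering lines up with the (h, w) size convention adopted for scale in this PR.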

chsasank · 2017-09-26T13:09:24Z

Sorry, got a bit late. Let me know if I have to change anything else.

soumith · 2017-09-26T14:17:33Z

🍾 🎆

fmassa · 2017-09-26T14:36:09Z

Thanks a lot @chsasank and sorry for the delay in finishing reviewing the PR!
This is going to be very helpful!

alykhantejani · 2017-09-26T14:38:43Z

Thanks a lot @chsasank! 🎉


chsasank added 2 commits September 3, 2017 23:35

First cut refactoring

5a2bbc5

(cherry picked from commit 71afec427baca8e37cd9e10d98812bc586e9a4ac)

Modify assert for pad

bf38166

chsasank mentioned this pull request Sep 3, 2017

Proposal for extending transforms #230

Closed

fmassa requested changes Sep 3, 2017

Asserts for functions

7aeec57

fmassa mentioned this pull request Sep 3, 2017

Random transforms for both input and target? #9

Closed

raise TypeErrors instead of assertins

8b18f52

alykhantejani reviewed Sep 4, 2017

fmassa mentioned this pull request Sep 5, 2017

OpenCV transforms with tests #34

Closed

This was referenced Sep 13, 2017

Separate random generation from transforms #115

Closed

Multicrop - missing feature #61

Closed

chsasank added 2 commits September 16, 2017 22:16

Make get_params static method

4390b55

Add documentation

f4ddc92

Fix a bug in randomsizedcrop

538d87b

alykhantejani reviewed Sep 17, 2017

chsasank commented Sep 19, 2017

chsasank added 2 commits September 19, 2017 11:22

scale change to (h, w) ordering. (based on pytorch#256)

4d7f70b

Merge branch 'master' of github.com:pytorch/vision

47800d4

alykhantejani reviewed Sep 19, 2017

This was referenced Sep 19, 2017

add RandomVerticalFlip transform #262

Merged

[Feature Request] More Image Transforms (Brightness, Contrast, Hue) #271

Closed

change x,y,w,h -> i,j,h,w

2cc58ed

soumith merged commit 459dc59 into pytorch:master Sep 26, 2017

chsasank added a commit to chsasank/vision that referenced this pull request Sep 26, 2017

VerticalFlip converted to follow refactor pytorch#240

256495f

chsasank mentioned this pull request Sep 26, 2017

VerticalFlip converted to follow refactor #240 #272

Merged

soumith added a commit that referenced this pull request Sep 26, 2017

Merge pull request #272 from chsasank/flip

a5b75c8

VerticalFlip converted to follow refactor #240

chsasank mentioned this pull request Sep 26, 2017

TenCrop and FiveCrop refactored #273

Merged

Noiredd mentioned this pull request Jul 25, 2019

[Feature proposal] Allow processing multiple images with transforms.Compose #1169

Open

pmeier mentioned this pull request Jan 20, 2023

[PoC] reinstate get_params #7095

Closed
