Skip to content

Refactor of transforms #240

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Sep 26, 2017
Merged

Refactor of transforms #240

merged 10 commits into from
Sep 26, 2017

Conversation

chsasank
Copy link
Contributor

@chsasank chsasank commented Sep 3, 2017

To allow easy subclassing to extend to joint transforms.

See #230.
First cut implementation. Needs polish, I guess.
Lot of things are missing, like docs etc.

(cherry picked from commit 71afec427baca8e37cd9e10d98812bc586e9a4ac)
Copy link
Member

@fmassa fmassa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This overall looks good.
I think though that it might be better to make the random parameter generation as standalone functions, what do you think?
Also, it might be good to add some extra type checks to verify that the user passed the right type to the functions, as they can either be PILImage objects or torch Tensors.

return img.crop((x, y, x + w, y + h))


def scaled_crop(img, x, y, w, h, size, interpolation=Image.BILINEAR):

This comment was marked as off-topic.

This comment was marked as off-topic.

@@ -214,6 +254,13 @@ def __init__(self, size):
else:
self.size = size

def get_params(self, img):

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

@@ -298,6 +342,16 @@ def __init__(self, size, padding=0):
self.size = size
self.padding = padding

def get_params(self, img):

This comment was marked as off-topic.

This comment was marked as off-topic.

@@ -352,7 +401,7 @@ def __init__(self, size, interpolation=Image.BILINEAR):
self.size = size
self.interpolation = interpolation

def __call__(self, img):
def get_params(self, img):

This comment was marked as off-topic.


def __call__(self, img):
x, y, w, h = self.get_params(img)
return scaled_crop(img, x, y, w, h, self.size, self.interpolation)

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.



def crop(img, x, y, w, h):
return img.crop((x, y, x + w, y + h))

This comment was marked as off-topic.



def normalize(tensor, mean, std):
# TODO: make efficient

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

Copy link
Contributor

@alykhantejani alykhantejani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @chsasank this looks very good. I've left some inline comments.

I also agree with @fmassa about having standalone functions to genterate random crop params, as if we want to allow the users to use purely the functional interface (to generate complex transform graphs), they will need a way to generate these params (without creating an object)

return img


def to_pilimage(pic):

This comment was marked as off-topic.

y1 = int(round((h - th) / 2.))
return img.crop((x1, y1, x1 + tw, y1 + th))
x1, y1, tw, th = self.get_params(img)
return crop(img, x1, y1, tw, th)

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

@@ -260,7 +344,7 @@ def __call__(self, img):
Returns:
PIL.Image: Padded image.
"""
return ImageOps.expand(img, border=self.padding, fill=self.fill)
return pad(img, self.padding, self.fill)


class Lambda(object):

This comment was marked as off-topic.

This comment was marked as off-topic.


def __call__(self, img):
x, y, w, h = self.get_params(img)
return scaled_crop(img, x, y, w, h, self.size, self.interpolation)

This comment was marked as off-topic.

@chsasank
Copy link
Contributor Author

reminder @alykhantejani @fmassa.

@alykhantejani
Copy link
Contributor

Hi @chsasank from my side this looks pretty good although I think we should address the following before merging:

  1. For RandomCrop and CenterCrop I think we should not have the objects doing any of the parameter generation, but instead, we should do one of the following two possibilities:
    Option 1
  • have a crop(x,y,w,h) function that users can call with any params (and do the param generation themselves).
  • have a random_crop(size) and a center_crop(size) function that contain the param generation and are convenience functions.
  • The RandomCrop and CenterCrop objects would just call these conveniences functions

Option 2
If we think the random parameter generation is sufficiently complex then just have a single crop function (which is a wrapper around PIL's Image.crop) and helper functions to generate random crop bounds and center crop bounds.

The objects would then call a combination of get_x_params and crop

  1. nit: rename to_pilimage to to_pil_image

@fmassa
Copy link
Member

fmassa commented Sep 16, 2017

Sorry for the delay in reviewing.

I think this is almost ready to go. About @alykhantejani last comments, I'd go with a variant of option 2.

Given that one of the goals of this refactoring is to be able to extend random transforms to many inputs, I think it's better not to add randomness in the operations (so I'd avoid random_crop functions).
But I'm also unsure on where we should add the random parameter generation in order not to clutter the namespace (as they are very tied to the current transform classes).

What about letting the get_params be a @staticmethod, so that it can be called without instantiating the class? Also, I'm not 100% satisfied with this name, but I have no better ideas now.

Thoughts?

@chsasank
Copy link
Contributor Author

chsasank commented Sep 16, 2017

What about letting the get_params be a @staticmethod, so that it can be called without instantiating the class?

I like this idea too. I will go ahead with it, I guess.

@chsasank
Copy link
Contributor Author

Most of the requested changes are done. Also added documentation.

Do you want the new functions in a separate namespace? If so, any name suggestion?

Args:
img (PIL.Image): Image to be scaled.
size (sequence or int): Desired output size. If size is a sequence like
(w, h), output size will be matched to this. If size is an int,

This comment was marked as off-topic.

@alykhantejani
Copy link
Contributor

Hi @chsasank this looks good to go once the merge conflicts are resolved. As for namespace, I don't like that we can now do from torchvision.transforms import ToTensor and from torchvision.transforms import to_tensor but don't have any better ideas right now...

@chsasank
Copy link
Contributor Author

Oh no. That means I've to change whole lot of things. Like parameters order in scale, crop functions etc.

@alykhantejani
Copy link
Contributor

@chsasank Just the params in Scale changed order, so should be quite a small change.

return ImageOps.expand(img, border=padding, fill=fill)


def crop(img, x, y, w, h):

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

Copy link
Contributor

@alykhantejani alykhantejani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @chsasank looks good - just a few final small things to change and I think we're good to merge.

return ImageOps.expand(img, border=padding, fill=fill)


def crop(img, x, y, w, h):

This comment was marked as off-topic.

return img.crop((x, y, x + w, y + h))


def scaled_crop(img, x, y, w, h, size, interpolation=Image.BILINEAR):

This comment was marked as off-topic.

y: Upper pixel coordinate.
w: Width of the cropped image.
h: Height of the cropped image.
size (sequence or int): Desired output size. Same semantics as ``scale``.

This comment was marked as off-topic.

PIL.Image: Cropped image.
"""
assert _is_pil_image(img), 'img should be PIL Image'
img = crop(img, x, y, w, h)

This comment was marked as off-topic.

@fmassa
Copy link
Member

fmassa commented Sep 25, 2017

@chsasank once you change the x,y->i,j order this is good to merge! Thanks a lot!

@chsasank
Copy link
Contributor Author

Sorry, got a bit late. Let me know if I have to change anything else.

@soumith soumith merged commit 459dc59 into pytorch:master Sep 26, 2017
@soumith
Copy link
Member

soumith commented Sep 26, 2017

🍾 🎆

@fmassa
Copy link
Member

fmassa commented Sep 26, 2017

Thanks a lot @chsasank and sorry for the delay in finishing reviewing the PR!
This is going to be very helpful!

@alykhantejani
Copy link
Contributor

Thanks a lot @chsasank! 🎉

chsasank added a commit to chsasank/vision that referenced this pull request Sep 26, 2017
soumith added a commit that referenced this pull request Sep 26, 2017
VerticalFlip converted to follow refactor #240
rajveerb pushed a commit to rajveerb/vision that referenced this pull request Nov 30, 2023
* Delete Caffe2 object_detection

* Added new pytorch-based object_detection

* object_detection: removed unused configs; deleted misleading code

* object_detection Dockerfile now based on public image and specifies exact library versions
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants