Adds bounding boxes conversion #2710


Merged
merged 23 commits into pytorch:master on Oct 1, 2020

Conversation

oke-aditya
Contributor

@oke-aditya oke-aditya commented Sep 27, 2020

Closes #2687

  • Added code
  • Added documentation
  • Added tests

As per the issue, I have added two utility functions to convert boxes to Pascal VOC format (x1, y1, x2, y2).

The tests convert boxes to the other format and back again, ensuring the two operations are inverses of each other and can be used interchangeably. Tests pass locally.

This is ready for review. Do let me know!

cc @pmeier
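
As a minimal sketch of the round-trip property the tests check (written against box_convert, the single entry-point the discussion below eventually settles on; the box values here are made up):

import torch
from torchvision.ops import box_convert

# Two made-up boxes in Pascal VOC / xyxy format: (x1, y1, x2, y2).
boxes = torch.tensor([[10.0, 15.0, 30.0, 35.0],
                      [23.0, 35.0, 93.0, 95.0]])

# Convert to (x, y, w, h) and back; the round trip should reproduce the input.
roundtrip = box_convert(box_convert(boxes, "xyxy", "xywh"), "xywh", "xyxy")
assert torch.allclose(roundtrip, boxes)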

@codecov

codecov bot commented Sep 27, 2020

Codecov Report

Merging #2710 into master will increase coverage by 0.12%.
The diff coverage is 89.83%.


@@            Coverage Diff             @@
##           master    #2710      +/-   ##
==========================================
+ Coverage   72.93%   73.05%   +0.12%     
==========================================
  Files          95       96       +1     
  Lines        8239     8298      +59     
  Branches     1279     1291      +12     
==========================================
+ Hits         6009     6062      +53     
  Misses       1838     1838              
- Partials      392      398       +6     
Impacted Files Coverage Δ
torchvision/ops/boxes.py 93.25% <78.57%> (-6.75%) ⬇️
torchvision/ops/__init__.py 100.00% <100.00%> (ø)
torchvision/ops/_box_convert.py 100.00% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Collaborator

@pmeier pmeier left a comment


Thanks @oke-aditya for the PR! A few comments below. Additionally:

  1. For completeness: shouldn't we also have box_cxcywh_to_xywh and box_xywh_to_cxcywh? I know you advocated against it, but I think we should discuss this. @fmassa?
  2. Why did you add the pytorch-sphinx-theme as a submodule? I'm pretty sure we shouldn't do that, as the documentation is not built here.

@oke-aditya
Contributor Author

oke-aditya commented Sep 27, 2020

I will make the suggested changes. I just wrote code that works for now; I agree it needs cleaning up as suggested.

Also,

  1. I advised against it as we can simply cascade these two operations: box_xyxy_to_xywh(box_cxcywh_to_xyxy(boxes)) gives the required result if needed. We can either leave it at that or add a function that does exactly this; it can be discussed and added if needed.

  2. Very sorry for the sphinx docs, I committed and pushed them by mistake. I have removed them.

@pmeier
Collaborator

pmeier commented Sep 27, 2020

I advised against it as we can simply cascade these two operations.

True, but in that case I would ask why we went for this particular pair of conversions and not some other. In general, IMO it is always a good idea for conversion functions to have a "core" representation and to perform all other conversions only to and from it. Since we only have 3 different representations here, I think we should simply implement them all.

Very sorry for sphinx docs

Don't be. That is why we have code review 😉

@oke-aditya
Contributor Author

oke-aditya commented Sep 27, 2020

The reason for this choice of representations is that the detection models in torchvision accept the xyxy format.
I added only xywh to xyxy and cxcywh to xyxy because these were the only two conversions I could find being used; do let me know if others are needed!

I agree that we should provide generic functions. Right now we have only 3 different representations (cxcywh, xywh, xyxy) to support.
If these increase later, we would be bound to provide the other conversions for consistency.

Hence the plan was to build the minimal set that gets the job done.

I would be happy to add these functions for interconvertibility, but let's hear what @fmassa thinks.
Both sides have fair points! It is just a design choice we need to make.
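
For illustration, the same box in each of the three formats: xyxy = (10, 15, 30, 35), xywh = (10, 15, 20, 20), cxcywh = (20, 25, 20, 20).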

Member

@fmassa fmassa left a comment


Thanks a lot for the PR @oke-aditya and for the reviews @pmeier and @vfdev-5 !

I have a couple of comments, most notably that we should avoid in-place operations on the input argument; I have already had subtle bugs in the past because of this (imagine your results after the 1st epoch being completely wrong).

I also have a meta question that I would like to discuss here: we originally discussed adding 2 conversion functions (xyxy_to_xywh and the other way around), but we also added the cxcywh_to_xyxy variants as well.
This brings the question of the scalability of the approach, as for each new format it adds at least 2 new functions (or ~ 2 * (n ** 2 - (n - 1) ** 2) if we do the full conversion matrix).
For reference, Detectron2 uses a BoxMode class to represent / implement the conversion types, which lets it handle the conversions as it wishes, with only a single entry-point.

I'm not advocating for using BoxMode (or something like this), but my original idea was that we would only be adding support for xyxy and xywh, which is still manageable.

I'm looking forward to your thoughts

@pmeier
Collaborator

pmeier commented Sep 28, 2020

@fmassa

This brings the question of the scalability of the approach, as for each new format it adds at least 2 new functions (or ~ 2 * (n ** 2 - (n - 1) ** 2) if we do the full conversion matrix).

Either you have a typo in your equation or something is off. It reduces down to 2 * (2*n - 1) and it should grow quadratically. IMO the number of functions for the full conversion matrix should be n * (n - 1).
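
For concreteness: with the n = 3 formats here, the full matrix is n * (n - 1) = 6 one-way functions, and a fourth format would add 2 * 3 = 6 more, whereas always passing through a single core format such as xyxy needs only 2 * (n - 1) of them (4 here).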

I'm not advocating for using BoxMode (or something like this), but my original idea was that we would only be adding support for xyxy and xywh, which is still manageable.

I'm looking forward to your thoughts

Speaking from a position of ignorance, as I have not worked with object detection very often: do we even need to scale this? I mean, are there more than the three representations (bottom left with height and width / center with height and width / bottom left and top right) at all? Sure, any corner might be used as the anchor, just as any two opposite corners might be, but is that common? If that is the case, we should discuss which variants we want to support and how elaborate the support should be.

If not, I think we can implement 2 functions for each representation and be done with it, since we are a general vision library rather than one specialized for object detection.

@oke-aditya
Contributor Author

I'm not highly experienced or qualified, but I would like to share some thoughts (ignore them if they make no sense).

  • It's not wise to provide every conversion, as the matrix grows quadratically and once we provide it we have to keep supporting it.

  • Cascading these operations probably gets the job done for now, so we have 2 conversions for each set. I guess there are not many popular ways of representing boxes; I have seen just these three (correct me if I'm wrong).

  • If users request more such methods, we can think about it in the future! Converting a representation to the xyxy format is something we can easily provide and maintain.

Let me know your thoughts; let us not create a feature that we cannot maintain.

@oke-aditya oke-aditya requested a review from fmassa September 28, 2020 17:14
@fmassa
Member

fmassa commented Sep 29, 2020

Either you have a typo in your equation or something is off. It reduces down to 2 * (2*n - 1) and it should grow quadratically. IMO the number of functions for the full conversion matrix should be n * (n - 1).

My intent was to say how many more functions we would need to add if we were to go from n-1 to n modes.

Speaking from a position of ignorance, as I have not worked with object detection very often: do we even need to scale this? I mean, are there more than the three representations (bottom left with height and width / center with height and width / bottom left and top right) at all? Sure, any corner might be used as the anchor, just as any two opposite corners might be, but is that common? If that is the case, we should discuss which variants we want to support and how elaborate the support should be.

We don't need to add all possible conversion combinations, but the fact that we are already adding 4 new functions makes me think that this approach doesn't scale. I'm OK with always having the conversions pass through xyxy, but even in this case we already have a lot of functions.

Let me illustrate with another example why I think we should follow a different approach here:

Here is my proposal: implement a function called convert_box(boxes, input_fmt, output_fmt) (better names welcome!) which is the single entry-point for performing those conversions. This way, the user only needs to care about one function.
input_fmt and output_fmt can be strings such as xyxy and xywh, so that we don't need to introduce any new abstractions.

Thoughts?

@oke-aditya
Contributor Author

Great thoughts. I guess it makes much more sense and is more generic. Users can simply pass 2 strings and get their bounding boxes converted without thinking much, so on the user side it is less of a headache.

Coming to our side, if we provide such a function we would need to provide all conversions, as the user will not know which combinations are possible; they would simply expect the boxes to be converted!

All conversions can occur through xyxy internally; that is less efficient, but it would reduce our codebase and we would have less to maintain.
We can add more efficient paths later without affecting the API. Internally we might have a lot of functions, but we expose just one to the user.

I completely agree with this opinion, really a good idea (we probably should have discussed this more in the issue, and I hurried into jumping to code, sorry for that).

@fmassa
Member

fmassa commented Sep 29, 2020

Coming to our side, if we provide such a function we would need to provide all conversions, as the user will not know which combinations are possible; they would simply expect the boxes to be converted!

I would go with the approach you mentioned just afterwards -- always go through xyxy as an intermediate representation. As you said, we can always optimize in the future if we want.

Here is some pseudo-code illustrating one potential implementation:

def convert_boxes(boxes, in_fmt, out_fmt):
    allowed_fmts = ...
    assert in_fmt in allowed_fmts
    assert out_fmt in allowed_fmts
    if in_fmt == out_fmt:
        return boxes.clone()  # to ensure always returning a copy
    if in_fmt != 'xyxy' and out_fmt != 'xyxy':
        # convert one to xyxy and change either in_fmt or out_fmt to xyxy
    # dispatch to the existing functions
    ...

Also, I think it might be preferable to spell it as convert_boxes instead of convert_box because it supports multiple boxes at once, but I think we already named it box_area in the past so maybe it's not that much of an issue?
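
For illustration, here is a self-contained sketch of how that single entry-point might look once fleshed out (a hedged reconstruction, not the code that was merged; the _box_* helper names simply follow the naming convention discussed further down):

import torch
from torch import Tensor


def _box_xywh_to_xyxy(boxes: Tensor) -> Tensor:
    # (x, y, w, h) -> (x1, y1, x2, y2): add width/height to the anchor corner.
    x, y, w, h = boxes.unbind(-1)
    return torch.stack((x, y, x + w, y + h), dim=-1)


def _box_xyxy_to_xywh(boxes: Tensor) -> Tensor:
    # (x1, y1, x2, y2) -> (x, y, w, h)
    x1, y1, x2, y2 = boxes.unbind(-1)
    return torch.stack((x1, y1, x2 - x1, y2 - y1), dim=-1)


def _box_cxcywh_to_xyxy(boxes: Tensor) -> Tensor:
    # (cx, cy, w, h) -> (x1, y1, x2, y2): corners sit half a width/height from the center.
    cx, cy, w, h = boxes.unbind(-1)
    return torch.stack((cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h), dim=-1)


def _box_xyxy_to_cxcywh(boxes: Tensor) -> Tensor:
    # (x1, y1, x2, y2) -> (cx, cy, w, h)
    x1, y1, x2, y2 = boxes.unbind(-1)
    return torch.stack(((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1), dim=-1)


def box_convert(boxes: Tensor, in_fmt: str, out_fmt: str) -> Tensor:
    allowed_fmts = ("xyxy", "xywh", "cxcywh")
    assert in_fmt in allowed_fmts and out_fmt in allowed_fmts

    if in_fmt == out_fmt:
        return boxes.clone()  # always hand back a copy, never the input tensor

    # Normalise the input to xyxy first, so every remaining case is xyxy -> out_fmt.
    if in_fmt == "xywh":
        boxes = _box_xywh_to_xyxy(boxes)
    elif in_fmt == "cxcywh":
        boxes = _box_cxcywh_to_xyxy(boxes)

    if out_fmt == "xywh":
        boxes = _box_xyxy_to_xywh(boxes)
    elif out_fmt == "cxcywh":
        boxes = _box_xyxy_to_cxcywh(boxes)
    return boxes

Note that all the helpers in this sketch build new tensors rather than modifying the input, which matches the concern above about in-place operations on the input argument.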

@oke-aditya
Contributor Author

We can name it convert_box; it won't be an issue, as we always expect box to be Tensor[N] (e.g. box_area, box_iou), so it stays consistent.

Right now I think I will refactor the code to work internally through xyxy; the prototype seems great to me.

Let me refactor the code.

Should I do this in a new PR or continue here? This PR will become quite dirty.
The reason being that I will need to change code, docs and tests, though a lot of it is re-use.

I guess all the conversion functions used internally should be named like _box_xyxy_to_xywh, since they will only be used internally; we won't provide docs for them or include them in __init__ and __all__.

This will save a lot of effort in maintaining docs for these internal functions, and we can instead keep clear documentation and usage for the convert_box function above.

@fmassa
Member

fmassa commented Sep 29, 2020

We can continue in this PR, and the proposal of renaming the current functions as _box_xyxy_to_xywh is what I would have suggested as well.

Good point about box_area / box_iou, which makes me think that we should maybe name it box_convert instead?

Don't worry about the history of the commits, it will all get squashed by GitHub before merging.

One thing to keep in mind: can you add a test for torchscriptability as well? Something like

out = box_convert(boxes, 'xyxy', 'xywh')
scripted_fn = torch.jit.script(box_convert)
out_script = scripted_fn(boxes, 'xyxy', 'xywh')
self.assertTrue((out - out_script).abs().max() < TOLERANCE)

This will ensure that our transform is ready to be exported to C++. Let us know if you have issues making the code work with torchscript.

And thanks a lot for your help!

@oke-aditya
Contributor Author

oke-aditya commented Sep 30, 2020

Sorry for the delay.
As per the discussion, I rearranged the code a bit and made the box_convert function in boxes.py.

The other utility conversion functions, which are few for now (but might grow in the future), I moved to a separate file, _box_convert.py, with everything renamed according to the conventions. Let me know if this is fine; keeping them in boxes.py would pollute that file with code that is better abstracted away, hence the move.

I refactored the tests for this new API.

I think I added tests for all the conversions; do let me know if I missed something.

Documentation is generated only for the box_convert function, not for the internal helpers.

Let me know if this works and if it needs changes :-)

@oke-aditya oke-aditya requested review from fmassa and pmeier September 30, 2020 17:35
@oke-aditya
Contributor Author

oke-aditya commented Sep 30, 2020

I added the JIT test as well, but it kept failing for me locally and I'm not sure why. Can someone have a look, please?
I have commented it out for now to avoid a CI failure here.

Member

@fmassa fmassa left a comment


The code looks great, thanks a lot for all your work @oke-aditya !

Do you remember what the test failure was that you were facing with torchscript? From looking at your implementation, I don't see why it should fail.

I only have a couple of documentation suggestions; the other comment can be left for a future PR.

Comment on lines 160 to 168
if in_fmt == "xywh":
boxes_xyxy = _box_xywh_to_xyxy(boxes)
if out_fmt == "cxcywh":
boxes_converted = _box_xyxy_to_cxcywh(boxes_xyxy)

elif in_fmt == "cxcywh":
boxes_xyxy = _box_cxcywh_to_xyxy(boxes)
if out_fmt == "xywh":
boxes_converted = _box_xyxy_to_xywh(boxes_xyxy)
Member


While this is fine, my first thought was to do something like the following

            if in_fmt == "xywh":
                boxes = _box_xywh_to_xyxy(boxes)
                in_fmt = "xyxy"
            elif in_fmt == "cxcywh":
                boxes = _box_cxcywh_to_xyxy(boxes)
                in_fmt = "xyxy"

and let the rest of the dispatch be done in the last branch. This way, we don't need to replicate the output dispatch logic here.

You don't need to change this here, so that we can move forward quickly with this PR, but it would be good to send a follow-up PR improving this part after this one gets merged. Thoughts?

Contributor Author


I guess I will leave this for the next PR; my original idea was that we would support direct conversions at some point, and this layout should be simpler to refactor in the future. But this works fine too.

Comment on lines +730 to +732
# def test_bbox_convert_jit(self):
# box_tensor = torch.tensor([[0, 0, 100, 100], [0, 0, 0, 0],
# [10, 15, 30, 35], [23, 35, 93, 95]], dtype=torch.float)
Member


Two options here:

  • we merge the PR now and try to fix torchscript later
  • we fix torchscript right now.

Do you remember what type of errors you were facing? I'm fine with both approaches, so that we can move forward with this PR (but we should fix torchscript soon if we merge this without torchscript support)

@oke-aditya
Contributor Author

oke-aditya commented Oct 1, 2020

I will add these documentation fixes.
I'm not very experienced with torchscript, so let us fix it in a new follow-up PR. I will open it as soon as this gets merged.

(I guess all the operations in boxes support torchscript, and let's keep it that way for the October release as well.)

IIRC torchscript failed because it could not find an else block for an if (I will post the error stack in the new PR).

The code can be cleaned up as you suggested, but I would like to have torchscript support first and then clean up.

Let's leave those changes to a separate PR
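
If the failure was the common TorchScript complaint that a variable is not defined on every code path, which the missing else block described above suggests, the usual fix is to make sure each branch assigns the variable, or to end the chain with an explicit else. A minimal sketch of that pattern (not the actual follow-up patch):

import torch

@torch.jit.script
def pick(flag: bool) -> torch.Tensor:
    # TorchScript needs `out` to be defined on every path, hence the explicit else.
    if flag:
        out = torch.zeros(1)
    else:
        out = torch.ones(1)
    return out
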
@oke-aditya
Contributor Author

I guess in one of the follow-up PRs I will clean up all the assert statements; there are many places where I can see such inconsistent use.

Member

@fmassa fmassa left a comment


Sounds good, let's fix torchscript and the minor refactorings in a follow-up PR.

Thanks a lot @oke-aditya !

@oke-aditya
Contributor Author

Just let me add the documentation; I'm about to push the changes 😅

Member

@fmassa fmassa left a comment


Thanks a lot! Looking forward to the torchscript improvements!

@fmassa fmassa merged commit e70c91a into pytorch:master Oct 1, 2020
@oke-aditya oke-aditya deleted the bbox_conv branch October 1, 2020 11:26
bryant1410 pushed a commit to bryant1410/vision-1 that referenced this pull request Nov 22, 2020
* adds boxes conversion

* adds documentation

* adds xywh tests

* fixes small typo

* adds tests

* Remove sphinx theme

* corrects assertions

* cleans code as per suggestion

Signed-off-by: Aditya Oke <[email protected]>

* reverts assertion

* fixes to assertEqual

* fixes inplace operations

* Adds docstrings

* added documentation

* changes tests

* moves code to box_convert

* adds more tests

* Apply suggestions from code review

Let's leave those changes to a separate PR

* fixes documentation

Co-authored-by: Francisco Massa <[email protected]>
vfdev-5 pushed a commit to Quansight/vision that referenced this pull request Dec 4, 2020
@pmeier pmeier mentioned this pull request May 17, 2022