
Adapted functional tensor tests on CPU/CUDA #2569


Merged (6 commits) on Aug 20, 2020

Conversation

@vfdev-5 (Collaborator) commented Aug 10, 2020

Related to #2292 (comment)

Description:

TODO:

  • Adapt test_adjustments
  • Adapt test_rgb_to_grayscale

Commit: Adapted almost all functional tensor tests on CPU/CUDA
- fixed bug with transforms using generated grid
- remains *_crop, blocked by pytorch#2568
- TODO: test_adjustments
@fmassa (Member) left a comment

Thanks a lot for the PR @vfdev-5!

I've left a few comments; let me know what you think.

@@ -15,18 +15,31 @@
import torchvision.transforms.functional as F


def run_on_cpu_and_cuda(fn):
@fmassa (Member) commented

This is cool, and is very similar to the PyTorch equivalent onlyOnCPUandCUDA.

One thing I would like to double-check: if the test fails on CUDA (or CPU) for some reason, does the error message indicate that the issue was on CUDA?

I think @mruberry can also give some insights / feedback on this, given that he has worked on this on the PyTorch side.

@mruberry (Contributor) commented

This is neat but it won't indicate which (sub)test, CPU or CUDA, failed. The test itself could include the device string in error messages it prints when it fails.

This is a lightweight way of running the same test on CPU and CUDA, so maybe that's OK. A question we may want to follow up on is whether torchvision can just use PyTorch's device-generic test framework.

@vfdev-5 (Collaborator, Author) commented

@mruberry yes, locally I quickly tried subTest, but when running with pytest it does not show the device either...

@mruberry (Contributor) commented

I think you could get it to work with subTest, but I think to support subtests in pytest you need a plugin? For the PyTorch CI (not sure about torchvision) we actually avoid using subTest for that reason.

@vfdev-5 (Collaborator, Author) commented Aug 10, 2020

Actually, when a test fails on a device, the test's input args are shown like this:

______________________________________________ Tester.test_vflip _________________________________________________________
                                                        
self = <test_functional_tensor.Tester testMethod=test_vflip>, args = (), kwargs = {}, device = 'cuda'  

So maybe this is sufficient, as we can see on which device it fails. What do you think?
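
For reference, here is a minimal sketch of a helper that would produce output like the above, where the device is passed as an explicit argument so it appears in the failing call's signature. The name matches the decorator under discussion, but the body is an assumption, not necessarily this PR's implementation:

import torch


def run_on_cpu_and_cuda(fn):
    # Hypothetical sketch: call the wrapped test once per available device,
    # passing the device as a keyword argument so it shows up in pytest's
    # report of the failing call (e.g. device = 'cuda' above).
    def wrapper(self, *args, **kwargs):
        fn(self, *args, device="cpu", **kwargs)
        if torch.cuda.is_available():
            fn(self, *args, device="cuda", **kwargs)
    return wrapper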

@mthrok (Contributor) commented

When a test fails, does it still run the remaining tests? I did not look into the details, but the function name suggests it executes all the tests sequentially, so one failure would stop the rest from running. This can be a very painful dev experience.

Also, can this run CPU and GPU tests separately? I could be wrong, but as far as I know you cannot run GPU tests in fbcode (so far), and when you define test suites in fbcode you need to separate them at the file level.

@vfdev-5 (Collaborator, Author) commented Aug 10, 2020

When a test fails, does it still run the remaining tests?

@mthrok it executes all self.test_* methods as usual, but it won't continue to the "cuda" case if the "cpu" subtest fails, which IMO is OK.
Otherwise, I can split all tests into two, such that everything is explicit.

Also can this test run CPU and GPU tests separately?

It adds "cuda" device if torch.cuda.is_available() is True, so if there is no GPUs it still can run on CPU... I agree that the name run_on_cpu_and_cuda can be misleading.

Let me know if you have other thoughts.

@mthrok (Contributor) commented Aug 10, 2020

When a test fails, does it still run the remaining tests?

@mthrok it executes all self.test_* as usually, but it wont continue on "cuda" device case if "cpu" subtest is failed which is IMO is OK.

I have the opposite opinion. Sometimes I have to iterate on CI jobs to fix something. It is easier if I can see all the failures from the beginning, so that I can locate the failure points without running CI multiple times and waiting for the results. (Of course, I am aware that I should be able to run these tests locally too.)

FYI: I have made everything separate and explicit in torchaudio for this reason. I am not advocating that torchvision take the same approach; I do not know as much about torchvision, so I understand if there are already established conventions that do not align with it.

https://github.com/pytorch/audio/blob/master/test/torchaudio_unittest/torchscript_consistency_impl.py
https://github.com/pytorch/audio/blob/master/test/torchaudio_unittest/torchscript_consistency_cpu_test.py
https://github.com/pytorch/audio/blob/master/test/torchaudio_unittest/torchscript_consistency_cuda_test.py

Let me know if you have other thoughts.

Even though the tests on GPU are automatically skipped if there is no CUDA available, if these tests are skipped all the time, fbcode will issue warnings. So it is ideal if CUDA tests are separated explicitly.

@vfdev-5 (Collaborator, Author) commented Aug 10, 2020

@mthrok thanks for the links, the torchaudio unit tests look clean and well split 👍

I agree that it would be better to split them in torchvision too, at least as separate test methods:

class Tester(unittest.TestCase):

    def _test_vflip(self, device):
        # implementation

    def test_vflip_cpu(self):
        self._test_vflip("cpu")

    @skipIfNoCuda
    def test_vflip_cuda(self):
        self._test_vflip("cuda")

Even though the tests on GPU are automatically skipped if there is no CUDA available, if these tests are skipped all the time, fbcode will issue warnings. So it is ideal if CUDA tests are separated explicitly.

Here, I have no idea how it works with fbcode; I'll let @fmassa decide on that.

@fmassa (Member) commented

About fbcode, let's not worry about that for now, and let's just skip the tests as normal.

- Merge branch 'master' of github.com:pytorch/vision into vfdev-5/issue-2292-on-cuda

img_tensor, pil_img = self._create_data(16, 18)
def _test_crop(self, device):
script_crop = torch.jit.script(F.crop)

grid = _gen_affine_grid(theta, w=shape[-1], h=shape[-2], ow=shape[-1], oh=shape[-2])
mode = _interpolation_modes[resample]
return _apply_grid_transform(img, grid, mode)


def _compute_output_size(theta: Tensor, w: int, h: int) -> Tuple[int, int]:
def _compute_output_size(matrix: List[float], w: int, h: int) -> Tuple[int, int]:
@vfdev-5 (Collaborator, Author) commented Aug 12, 2020

Changed the input from theta to matrix, as theta can be on CUDA and it is much slower to compute the output size on CUDA than on CPU.
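
As a rough illustration of the idea (a sketch under assumptions, not the exact torchvision code): by accepting a plain Python list, the helper builds its small tensors on the CPU regardless of where the image lives.

from typing import List, Tuple

import torch


def _compute_output_size(matrix: List[float], w: int, h: int) -> Tuple[int, int]:
    # Sketch only: build a small CPU tensor from the 6 affine coefficients,
    # map the image corners, and take the bounding-box extent.
    theta = torch.tensor(matrix, dtype=torch.float).reshape(1, 2, 3)
    # image corners in normalized [-1, 1] coordinates, homogeneous form
    pts = torch.tensor([
        [-1.0, -1.0, 1.0],
        [-1.0, 1.0, 1.0],
        [1.0, 1.0, 1.0],
        [1.0, -1.0, 1.0],
    ]).unsqueeze(0)
    new_pts = pts @ theta.transpose(1, 2)  # (1, 4, 2)
    sizes = new_pts.max(dim=1).values - new_pts.min(dim=1).values
    ow = int((sizes[0, 0] * 0.5 * w).item())
    oh = int((sizes[0, 1] * 0.5 * h).item())
    return ow, oh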

@vfdev-5 force-pushed the vfdev-5/issue-2292-on-cuda branch from 8196a52 to e1d54f1 on August 12, 2020 at 22:15
@vfdev-5 changed the title from "Adapted almost all functional tensor tests on CPU/CUDA" to "Adapted functional tensor tests on CPU/CUDA" on Aug 13, 2020
@fmassa (Member) left a comment

Looks great, thanks!

I have only one comment, which can be addressed in a follow-up PR.

@@ -880,7 +885,7 @@ def perspective(
)

ow, oh = img.shape[-1], img.shape[-2]
grid = _perspective_grid(perspective_coeffs, ow=ow, oh=oh)
grid = _perspective_grid(perspective_coeffs, ow=ow, oh=oh, device=img.device)
@fmassa (Member) commented

For a future PR: I'm not sure this will work as expected if the input tensor is float64 (or float16).

@@ -828,19 +832,20 @@ def _perspective_grid(coeffs: List[float], ow: int, oh: int):
theta1 = torch.tensor([[
[coeffs[0], coeffs[1], coeffs[2]],
[coeffs[3], coeffs[4], coeffs[5]]
]])
]], dtype=torch.float, device=device)
@fmassa (Member) commented

For the future: we might want to pass a dtype to this function as well, otherwise other functions might fail for float64.
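
For illustration only, a small assumed example of deriving both dtype and device from the input image when building the coefficient tensor (hypothetical values, not the PR's code):

import torch

# Assumed example: a float64 image keeps its precision because the
# coefficient tensor is created with the image's dtype and device.
img = torch.rand(3, 8, 8, dtype=torch.float64)
coeffs = [1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
theta1 = torch.tensor(
    [[[coeffs[0], coeffs[1], coeffs[2]],
      [coeffs[3], coeffs[4], coeffs[5]]]],
    dtype=img.dtype, device=img.device,
)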

if isinstance(pic, torch.Tensor):
npimg = np.transpose(pic.numpy(), (1, 2, 0))
if pic.is_floating_point() and mode != 'F':
@fmassa (Member) commented

Note: this seems to actually fix a bug where, if the image was double, it wouldn't be multiplied by 255 and cast to byte.

@vfdev-5 (Collaborator, Author) commented

Could you please elaborate on this note? Do we need to do something here?
I'm writing a follow-up issue to address the improvements...

@fmassa (Member) commented

Just making sure we properly track it somehow so that we can mention this in the release notes
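
For context, a small assumed example of the behaviour in question: is_floating_point() is True for float64 (double) as well as float32, so a double image is now also scaled to [0, 255] and cast to byte before PIL conversion.

import numpy as np
import torch

pic = torch.rand(3, 4, 4, dtype=torch.float64)
npimg = np.transpose(pic.numpy(), (1, 2, 0))
if pic.is_floating_point():  # True for float32, float64 and float16
    npimg = (npimg * 255).astype(np.uint8)
print(npimg.dtype)  # uint8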

@fmassa merged commit 3970299 into pytorch:master on Aug 20, 2020
bryant1410 pushed a commit to bryant1410/vision-1 that referenced this pull request Nov 22, 2020
* Adapted almost all functional tensor tests on CPU/CUDA
- fixed bug with transforms using generated grid
- remains *_crop, blocked by pytorch#2568
- TODO: test_adjustments

* Apply suggestions from code review

Co-authored-by: Francisco Massa <[email protected]>

* Fixed issues according to review

* Split tests into two: cpu and cuda

* Updated test_adjustments to run on CPU and CUDA

Co-authored-by: Francisco Massa <[email protected]>
Labels: none yet
Projects: none yet
4 participants