NMS behaviour w.r.t. fp16 vs fp32 #3371

@SiftingSands

🐛 Bug

NMS gives significantly different outputs when switching boxes from FP32 to FP16. I couldn't find any related issue here or on the discussion board, and I didn't see an obvious cause from reading the docs.

To Reproduce

Call torchvision.ops.nms(boxes, scores, iou_threshold=0.2) with the FP32 boxes and scores in data.zip.
With FP32 inputs, NMS returns a single kept box.

Convert boxes to float16 (I used .to(torch.float16)) and call NMS again: it now returns 37 kept boxes (is no suppression being performed?).
I didn't expect converting from FP32 to FP16 to change the results this dramatically. Let me know if this is just user error.
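
One thing that may be worth checking when comparing the two dtypes is how much FP16 can represent at the magnitudes involved. The sketch below is only an illustration: it uses the standard library's half-precision `struct` format to round every intermediate of an IoU computation to FP16, it does not call torchvision, and the helper names (`to_fp16`, `iou`) and the box values are hypothetical rather than taken from data.zip. It reflects IEEE 754 half precision in general, where the largest finite value is 65504, so whether this is what happens inside the NMS kernel for the data above is an assumption that has not been verified here.

```python
import math
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision (FP16) value."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except (OverflowError, struct.error):
        # Finite values beyond the FP16 range (largest finite value: 65504)
        # overflow to +/-inf in IEEE 754 arithmetic.
        return math.copysign(math.inf, x)


def iou(a, b, cast=float):
    """IoU of two (x1, y1, x2, y2) boxes; every intermediate goes through cast."""
    a = [cast(v) for v in a]
    b = [cast(v) for v in b]
    area_a = cast(cast(a[2] - a[0]) * cast(a[3] - a[1]))
    area_b = cast(cast(b[2] - b[0]) * cast(b[3] - b[1]))
    inter_w = max(cast(min(a[2], b[2]) - max(a[0], b[0])), 0.0)
    inter_h = max(cast(min(a[3], b[3]) - max(a[1], b[1])), 0.0)
    inter = cast(inter_w * inter_h)
    union = cast(cast(area_a + area_b) - inter)
    return cast(inter / union)


# Hypothetical boxes (not from data.zip): 300 x 300 pixels, heavily overlapping.
box_a = (0.0, 0.0, 300.0, 300.0)
box_b = (10.0, 10.0, 310.0, 310.0)

print(iou(box_a, box_b))                # full precision: about 0.877
print(iou(box_a, box_b, cast=to_fp16))  # nan: the 90000-pixel areas overflow FP16
```

In this emulation the half-precision IoU is NaN, and NaN compared against any threshold is false, so neither box would be suppressed. If something similar applies to the real data, casting the boxes back with .to(torch.float32) before calling NMS would be one thing to try.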

Environment

PyTorch version: 1.7.1+cu101
Is debug build: False
CUDA used to build PyTorch: 10.1
ROCM used to build PyTorch: N/A

OS: Ubuntu 18.04.5 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: Could not collect
CMake version: version 3.10.2

Python version: 3.8 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: 
GPU 0: Tesla V100-SXM2-16GB
GPU 1: Tesla V100-SXM2-16GB
GPU 2: Tesla V100-SXM2-16GB
GPU 3: Tesla V100-SXM2-16GB
GPU 4: Tesla V100-SXM2-16GB
GPU 5: Tesla V100-SXM2-16GB
GPU 6: Tesla V100-SXM2-16GB
GPU 7: Tesla V100-SXM2-16GB

Nvidia driver version: 440.33.01
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.19.5
[pip3] torch==1.7.1+cu101
[pip3] torchaudio==0.7.2
[pip3] torchvision==0.8.2+cu101
[conda] Could not collect

Possible related issue?
