I have made a custom dataset using the COCO format, in which each image id is computed as a hash of the image and stored as an integer. These ids tend to be rather large: if you load them into a numpy array and check the dtype, some fit in int64 while others are larger and end up as uint64. So when I attempt to evaluate my dataset, I might do something like
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval
cocoGt = COCO('path/to/dataset.json')
cocoDt = cocoGt.loadRes('path/to/results.json')
cocoEval = COCOeval(cocoGt, cocoDt, 'bbox')
cocoEval.params.imgIds = cocoGt.getImgIds()[:10]
cocoEval.evaluate()
cocoEval.accumulate()
cocoEval.summarize()
Running per image evaluation...
DONE (t=0.46s).
Accumulating evaluation results...
DONE (t=0.38s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = -1
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = -1
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = -1
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = -1
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = -1
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = -1
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1
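For context, ids like the ones described above might be produced along these lines (purely a hypothetical sketch, since the actual hashing code is not part of this report):

import hashlib
import numpy as np

def make_image_id(path):
    # Hypothetical id scheme: take the first 8 bytes of a SHA-256 digest
    # of the file as an unsigned integer. Roughly half of such ids exceed
    # 2**63 - 1 and therefore only fit in uint64, not int64.
    with open(path, 'rb') as f:
        return int.from_bytes(hashlib.sha256(f.read()).digest()[:8], 'big')

ids = [make_image_id(p) for p in ['img_000.jpg', 'img_001.jpg']]
print([np.array([i]).dtype for i in ids])  # e.g. [dtype('int64'), dtype('uint64')]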
After some digging I realized that the -1 values were just the defaults, and I've managed to track this down to

cocoEval.evaluate()
  --> cocoeval.py, line 134: p.imgIds = list(np.unique(p.imgIds))
In this case np.unique casts the input to a numpy array, and because the ids are a mix of np.int64-sized and np.uint64-sized values, the result is promoted to np.float64. Some examples:
>>> import numpy as np
>>> a = np.iinfo(np.int64).max
>>> b = np.iinfo(np.uint64).max
>>> np.array([a, a]).dtype
dtype('int64')
>>> np.array([b, b]).dtype
dtype('uint64')
>>> np.array([a, b]).dtype
dtype('float64')
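The promotion carries through np.unique as well, so by the time list() runs the ids are already NumPy floats:

>>> ids = [a, b]
>>> np.unique(ids).dtype
dtype('float64')
>>> type(list(np.unique(ids))[0])
<class 'numpy.float64'>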
This becomes a problem later on, when p.imgIds is a list of float values and we try to use them as dictionary keys. For example, coco.py line 145,

lists = [self.imgToAnns[imgId] for imgId in imgIds if imgId in self.imgToAnns]
# lists = []

will return an empty list, since large float64 values fail the membership test in the comprehension, i.e. bool(1.83426386843345553e+17 in [183426386843345553]) # False.
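To make the failure mode concrete, here is a minimal reproduction of that membership test, using a plain float() cast to stand in for the float64 round trip:

>>> img_id = 183426386843345553
>>> imgToAnns = {img_id: ['annotation']}  # keys are the original Python ints
>>> float(img_id) == img_id               # float64 cannot represent the id exactly
False
>>> float(img_id) in imgToAnns            # so the lookup in the comprehension silently misses
False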
My workaround for all of this is to pass a numpy array with the correct dtype to cocoEval.params.imgIds before calling evaluate(); a sketch is included below.
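Roughly the following, assuming every id is non-negative and fits in an unsigned 64-bit integer (continuing from the snippet at the top of this issue):

import numpy as np

# Hand COCOeval ids that already share one integer dtype, so np.unique
# never has to promote a mixed int64/uint64 list to float64.
cocoEval.params.imgIds = np.array(cocoGt.getImgIds()[:10], dtype=np.uint64)
cocoEval.evaluate()
cocoEval.accumulate()
cocoEval.summarize()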
However, I'm wondering whether it would be wise to do some validation on the dtype of the imgIds object, or at least raise a warning; this was a bit tricky to find because there was no obvious error.
Is it possible to avoid the np.unique call, so we don't cast to a numpy array and then back to a list?
If np.unique should be kept, what is the minimal thing that could be done to avoid this situation, or at least to warn the user that their imgIds may not behave correctly?
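One possibility, purely a sketch of what such a guard might look like rather than anything that exists in pycocotools today (the helper name is made up), would be to check the dtype before the np.unique call:

import warnings
import numpy as np

def _normalize_img_ids(img_ids):
    # Hypothetical helper: catch the silent int -> float64 promotion that
    # happens when int64- and uint64-sized ids are mixed in one list.
    arr = np.asarray(img_ids)
    if not np.issubdtype(arr.dtype, np.integer):
        warnings.warn(
            "imgIds were promoted to dtype %s; large ids may lose precision "
            "and silently match nothing during evaluation" % arr.dtype
        )
        arr = np.asarray(img_ids, dtype=np.uint64)  # assumes non-negative ids
    return list(np.unique(arr))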
I'm happy to submit a PR if this seems important; I would appreciate some input on the above points.
Hi,
This is more of a cautionary tale, and is unlikely to be fixed in numpy anytime soon (See numpy/numpy#7126, numpy/numpy#5745, numpy/numpy#12525 for a sample).