support axis=None for nanmedian ( issue #7352 ) #7440

toddrjen · 2014-06-12T13:18:52Z

This fixes #7352, where nanmedian does not work when axis==None.
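For context, reducing with axis=None means treating the array as flattened and taking a single median over every non-NaN element. Below is a minimal sketch of that expected behaviour in plain NumPy. It is not the code changed in pandas/core/nanops.py; the helper name flat_nanmedian is hypothetical, and returning NaN for an all-NaN input is an assumption.

```python
import numpy as np


def flat_nanmedian(values):
    """Median over every non-NaN element, i.e. the axis=None case (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    # Boolean indexing flattens the array and drops the NaNs in one step.
    valid = values[~np.isnan(values)]
    # Assumption: return NaN when no valid elements remain.
    if valid.size == 0:
        return np.nan
    return float(np.median(valid))


data = np.array([[1.0, np.nan, 3.0],
                 [4.0, 5.0, np.nan]])
result = flat_nanmedian(data)  # median of 1, 3, 4, 5
```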

toddrjen · 2014-06-12T14:26:11Z

I have added a fix (hopefully) for the rounding errors.

jreback · 2014-06-12T14:29:08Z

pandas/tests/test_nanops.py

@@ -118,11 +120,39 @@ def check_results(self, targ, res, axis):
        res = getattr(res, 'values', res)
        if axis != 0 and hasattr(targ, 'shape') and targ.ndim:
            res = np.split(res, [targ.shape[0]], axis=0)[0]
-        tm.assert_almost_equal(targ, res)
+        try:


you can just pass check_less_precise=True if it's a complex number (or explicitly astype before the comparison)

I tried that, it doesn't fix the problem. It still raises an AssertionError even if they only differ in their 16th digit.

>>> a=np.array([1+.1111111111111111*1j])
>>> b=np.array([1+.1111111111111112*1j])
>>> tm.assert_almost_equal(a, b, check_less_precise=True)
AssertionError: (1+0.1111111111111111j) != (1+0.1111111111111112j)

compare the real and imag parts separately and then it works correctly. A side issue is to patch tm.assert_almost_equal to deal with complex numbers by this method

In [1]: a=np.array([1+.1111111111111111*1j])
In [2]: b=np.array([1+.1111111111111112*1j])
In [4]: tm.assert_almost_equal(a.real, b.real)
Out[4]: True
In [5]: tm.assert_almost_equal(a.imag, b.imag)
Out[5]: True
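The suggestion to compare the two components separately can be sketched as a small standalone helper. This is only an illustration: the name assert_complex_almost_equal and the tolerance defaults are assumptions, and it uses numpy.testing.assert_allclose rather than the pandas tm.assert_almost_equal discussed here.

```python
import numpy as np


def assert_complex_almost_equal(left, right, rtol=1e-7, atol=0.0):
    """Compare real and imaginary parts independently (hypothetical helper)."""
    left = np.asarray(left)
    right = np.asarray(right)
    # Each component is an ordinary float array, so the usual
    # floating-point tolerance check applies to it directly.
    np.testing.assert_allclose(left.real, right.real, rtol=rtol, atol=atol)
    np.testing.assert_allclose(left.imag, right.imag, rtol=rtol, atol=atol)


a = np.array([1 + .1111111111111111 * 1j])
b = np.array([1 + .1111111111111112 * 1j])
assert_complex_almost_equal(a, b)  # differs only in the 16th digit, so this passes
```

The helper also accepts purely real input, because .imag is all zeros there.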

support axis=None for nanmedian ( issue #7352 )

jreback · 2014-06-12T22:41:33Z

thanks!

jreback · 2014-06-12T23:47:19Z

@toddrjen a bunch of tests failing on windows. I debugged one of them below.

If you can debug this, that would be great.

C:\Users\Jeff Reback\Documents\GitHub\pandas>more test.27-64.log
..............................................S................S................S................S......................................................................................................
...........................................................................................................S...S.S....SSS......................................S........................................
..............S.....................................................S..................................................SS...............................................................................
........................................................................................................................................................................................................
.................................................................................S......................................................................................................................
.................S............S.................................................................................................................SS..........................SSSS........................
....S.....S.............................................S.................SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS....S...................................................................S.................
............SSSSSSSSSSS...................................S.S...........................................................................................................................................
............S..................................S.......................SSSSSSS...................................................................................................S......................
............................................................................................................................................S...........................................................
..........................................................................S..S..........................................................................................................................
...................................................S...S................................................................................................................................................
........................................................................................................................................................................................................
..................................................................................................................S.....................................................................................
.........................................................................S..............................................................................................................................
.................S......................................................................................................................................................................................
...................................................................C:\python27-64\lib\site-packages\numpy\core\_methods.py:55: RuntimeWarning: Mean of empty slice.
  warnings.warn("Mean of empty slice.", RuntimeWarning)
........................................................................................................................................................................................................
........................................................................................................................................................................................................
........................................................................................................................................................................................................
...SS.....S...SS.......S.S.......................SS.....S...SS.......S.S..............SS..SS.......................................................................S....................................
.........................................C:\python27-64\lib\site-packages\matplotlib\axes.py:4747: UserWarning: No labeled objects found. Use label='...' kwarg on individual plots.
  warnings.warn("No labeled objects found. "
....................................................................................................................................................C:\python27-64\lib\site-packages\matplotlib\__init__
.py:1172: UserWarning:  This call to matplotlib.use() has no effect
because the backend has already been chosen;
matplotlib.use() must be called *before* pylab, matplotlib.pyplot,
or matplotlib.backends is imported for the first time.

  warnings.warn(_use_error_msg)
..............................................................................................................................................c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win
-amd64-2.7\pandas\core\index.py:1013: RuntimeWarning: Cannot compare type 'Timestamp' with type 'str', sort order is undefined for incomparable objects
  "incomparable objects" % e, RuntimeWarning)
..................................................c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\core\index.py:1013: RuntimeWarning: Cannot compare type 'Timestamp' with t
ype 'long', sort order is undefined for incomparable objects
  "incomparable objects" % e, RuntimeWarning)
.................................................................................................................................................................................S.....S................
................................................S.......................................................................................................................................................
........................................................................................................................................................................................................
........................................................................................................................................................................................................
........................................................................................................................................................................................................
........................................................................................................................................................................................................
........................................................................................................................................................................................................
........................................................................................................................................................................................................
........................................................................................................................................................................................................
........................................................................................................................................................................................................
...........................................................................................................................................................S............................................
......................c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\core\nanops.py:193: ComplexWarning: Casting complex values to real discards the imaginary part
  return ~np.isfinite(values.astype('float64'))
.............E..EE.E...E......................................................................................................................................S...S.............SSS.........S...........
.SS.............SS.....SSSS.............................................................................................................................................................................
.................................................S..............................S.......................................................................................................................
..................................................................................................
======================================================================
ERROR: test_nankurt (pandas.tests.test_nanops.TestnanopsDataFrame)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 373, in test_nankurt
    allow_complex=False, allow_str=False, allow_date=False)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 213, in check_funs
    self.check_fun(testfunc, targfunc, 'arr_float', **kwargs)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 202, in check_fun
    testarval, targarval, targarnanval, **kwargs)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 188, in check_fun_data
    **kwargs)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 188, in check_fun_data
    **kwargs)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 160, in check_fun_data
    **kwargs)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\core\nanops.py", line 43, in _f
    return f(*args, **kwargs)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\core\nanops.py", line 483, in nankurt
    count = _get_counts(mask, axis)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\core\nanops.py", line 542, in _get_counts
    count = (mask.shape[axis] - mask.sum(axis)).astype(float)
AttributeError: ("'long' object has no attribute 'astype'", 'axis: 0 of 0', 'skipna: False', 'kwargs: {}', 'testar: arr_float', 'targar: arr_float', 'targarnan: arr_float')

======================================================================
ERROR: test_nanmax (pandas.tests.test_nanops.TestnanopsDataFrame)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 320, in test_nanmax
    allow_str=False, allow_obj=False)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 241, in check_funs
    self.check_fun(testfunc, targfunc, 'arr_tdelta', **kwargs)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 202, in check_fun
    testarval, targarval, targarnanval, **kwargs)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 188, in check_fun_data
    **kwargs)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 188, in check_fun_data
    **kwargs)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 160, in check_fun_data
    **kwargs)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\core\nanops.py", line 88, in f
    result = alt(values, axis=axis, skipna=skipna, **kwds)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\core\nanops.py", line 418, in nanmax
    return _maybe_null_out(result, axis, mask)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\core\nanops.py", line 552, in _maybe_null_out
    if null_mask.any():
AttributeError: ("'bool' object has no attribute 'any'", 'axis: 0 of 0', 'skipna: False', 'kwargs: {}', 'testar: arr_tdelta', 'targar: arr_tdelta', 'targarnan: arr_tdelta')

======================================================================
ERROR: test_nanmean (pandas.tests.test_nanops.TestnanopsDataFrame)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 290, in test_nanmean
    allow_str=False, allow_date=False)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 213, in check_funs
    self.check_fun(testfunc, targfunc, 'arr_float', **kwargs)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 202, in check_fun
    testarval, targarval, targarnanval, **kwargs)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 188, in check_fun_data
    **kwargs)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 188, in check_fun_data
    **kwargs)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 160, in check_fun_data
    **kwargs)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\core\nanops.py", line 43, in _f
    return f(*args, **kwargs)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\core\nanops.py", line 88, in f
    result = alt(values, axis=axis, skipna=skipna, **kwds)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\core\nanops.py", line 261, in nanmean
    count = _get_counts(mask, axis)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\core\nanops.py", line 542, in _get_counts
    count = (mask.shape[axis] - mask.sum(axis)).astype(float)
AttributeError: ("'long' object has no attribute 'astype'", 'axis: 0 of 0', 'skipna: False', 'kwargs: {}', 'testar: arr_float', 'targar: arr_float', 'targarnan: arr_float')

======================================================================
ERROR: test_nanmin (pandas.tests.test_nanops.TestnanopsDataFrame)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 315, in test_nanmin
    allow_str=False, allow_obj=False)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 241, in check_funs
    self.check_fun(testfunc, targfunc, 'arr_tdelta', **kwargs)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 202, in check_fun
    testarval, targarval, targarnanval, **kwargs)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 188, in check_fun_data
    **kwargs)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 188, in check_fun_data
    **kwargs)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 160, in check_fun_data
    **kwargs)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\core\nanops.py", line 88, in f
    result = alt(values, axis=axis, skipna=skipna, **kwds)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\core\nanops.py", line 388, in nanmin
    return _maybe_null_out(result, axis, mask)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\core\nanops.py", line 552, in _maybe_null_out
    if null_mask.any():
AttributeError: ("'bool' object has no attribute 'any'", 'axis: 0 of 0', 'skipna: False', 'kwargs: {}', 'testar: arr_tdelta', 'targar: arr_tdelta', 'targarnan: arr_tdelta')

======================================================================
ERROR: test_nanskew (pandas.tests.test_nanops.TestnanopsDataFrame)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 365, in test_nanskew
    allow_complex=False, allow_str=False, allow_date=False)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 213, in check_funs
    self.check_fun(testfunc, targfunc, 'arr_float', **kwargs)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 202, in check_fun
    testarval, targarval, targarnanval, **kwargs)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 188, in check_fun_data
    **kwargs)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 188, in check_fun_data
    **kwargs)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 160, in check_fun_data
    **kwargs)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\core\nanops.py", line 43, in _f
    return f(*args, **kwargs)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\core\nanops.py", line 449, in nanskew
    count = _get_counts(mask, axis)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\core\nanops.py", line 542, in _get_counts
    count = (mask.shape[axis] - mask.sum(axis)).astype(float)
AttributeError: ("'long' object has no attribute 'astype'", 'axis: 0 of 0', 'skipna: False', 'kwargs: {}', 'testar: arr_float', 'targar: arr_float', 'targarnan: arr_float')

----------------------------------------------------------------------
Ran 7366 tests in 516.370s

FAILED (SKIP=135, errors=5)

C:\Users\Jeff Reback\Documents\GitHub\pandas>c:\python27-64\Scripts\nosetests.exe build\lib.win-amd64-2.7\pandas\tests\test_nanops.py --pdb --pdb-failure
......C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\core\nanops.py:193: ComplexWarning: Casting complex values to real discards the imaginary part
  return ~np.isfinite(values.astype('float64'))
.............> c:\users\jeff reback\documents\github\pandas\build\lib.win-amd64-2.7\pandas\core\nanops.py(542)_get_counts()
-> count = (mask.shape[axis] - mask.sum(axis)).astype(float)
(Pdb) l
537         return result
538
539
540     def _get_counts(mask, axis):
541         if axis is not None:
542  ->         count = (mask.shape[axis] - mask.sum(axis)).astype(float)
543         else:
544             count = float(mask.size - mask.sum())
545
546         return count
547
(Pdb) p mask
array([False, False, False, False, False, False, False, False, False,
       False, False], dtype=bool)
(Pdb) p axis
0
(Pdb) p mask.ndim
1
(Pdb) p mask.shape[axis]-mask.sum(axis)
11L
(Pdb) p mask.sum(axis)
0
(Pdb) u
> c:\users\jeff reback\documents\github\pandas\build\lib.win-amd64-2.7\pandas\core\nanops.py(483)nankurt()
-> count = _get_counts(mask, axis)
(Pdb) l
478     def nankurt(values, axis=None, skipna=True):
479         if not isinstance(values.dtype.type, np.floating):
480             values = values.astype('f8')
481
482         mask = isnull(values)
483  ->     count = _get_counts(mask, axis)
484
485         if skipna:
486             values = values.copy()
487             np.putmask(values, mask, 0)
488
(Pdb) p values
array([-0.38825224,  2.25028687,  0.97792431,  0.05118711, -0.38908183,
       -1.25383019, -0.97858595,  0.50348946,  0.91971294,  0.18107761,
       -0.97499552])
(Pdb) p mask
array([False, False, False, False, False, False, False, False, False,
       False, False], dtype=bool)
(Pdb)

toddrjen · 2014-06-13T08:46:09Z

The problem seems to be that the values are getting converted to a Python long instead of a NumPy scalar. The workaround is easy; I can include it either in a separate patch or as part of the next one.

But could you also try the following code and see what you get?

a=np.zeros(11).astype('bool')
b=a.shape[0] - a.sum(0)
type(b)

I have tried this on Linux, winpython3 x64, and pythonxy, and in all of them I get numpy.int64. Based on the unit test results, I would expect that you get long or int.
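One possible shape for such a workaround is sketched below. It is an illustration, not the patch that was eventually applied: get_counts is a hypothetical stand-in for nanops._get_counts that avoids calling .astype on a value that may be a plain Python integer.

```python
import numpy as np


def get_counts(mask, axis):
    """Count non-masked entries along axis (hypothetical variant of _get_counts)."""
    if axis is None:
        return float(mask.size - mask.sum())
    count = mask.shape[axis] - mask.sum(axis)
    # For a 1-D mask the result is a scalar, which on some platforms is a
    # plain Python integer with no .astype method; float() handles both
    # Python and NumPy scalars.
    if np.isscalar(count):
        return float(count)
    return count.astype(float)


mask_1d = np.zeros(11, dtype=bool)
mask_2d = np.zeros((3, 4), dtype=bool)
count_1d = get_counts(mask_1d, 0)
count_2d = get_counts(mask_2d, 0)
```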

toddrjen · 2014-06-13T09:14:20Z

At least looking at the last error, for nanskew, it doesn't appear that I have touched any of the code in that code path, so this doesn't seem to be a new bug.

jreback · 2014-06-13T12:59:27Z

All of the errors are related to the changes for _get_counts

Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

C:\Users\Jeff Reback>c:\python27-64\python
Python 2.7.5 (default, May 15 2013, 22:44:16) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> np.__version__
'1.8.0'
>>> a = np.zeros(11).astype('bool')
>>> b = a.shape[0]-a.sum(0)
>>> type(b)
<type 'long'>

toddrjen · 2014-06-13T13:20:00Z

There were no changes to _get_counts. Looking at the blame, it hasn't been touched since 2011.

jreback · 2014-06-13T13:21:26Z

it wasn't broken before I merged your first PR.

toddrjen · 2014-06-13T13:23:05Z

The first version of test_nanops.py didn't test 1D arrays at all, so it wouldn't have identified this problem.

jreback · 2014-06-13T13:23:36Z

ok, it's a problem now. pls have a look.

jreback added Bug labels Jun 12, 2014

jreback added this to the 0.14.1 milestone Jun 12, 2014

jreback reviewed Jun 12, 2014

support axis=None for nanmedian ( issue #7352 )

562b86e

jreback added a commit that referenced this pull request Jun 12, 2014

Merge pull request #7440 from toddrjen/nannone

326ef95

support axis=None for nanmedian ( issue #7352 )

jreback merged commit 326ef95 into pandas-dev:master Jun 12, 2014

toddrjen mentioned this pull request Jun 13, 2014

Fix bug where nanops._has_infs doesn't work with many dtypes (issue #7357) #7448

Merged
