Fix bug where nanops._has_infs doesn't work with many dtypes (issue #7357) #7448

Merged 1 commit on Jun 13, 2014


toddrjen
Contributor

Fixes issue #7357, where nanops._has_infs doesn't work with many dtypes.

@toddrjen
Contributor Author

Travis CI passes, but this does NOT include the fix for the bug mentioned in #7440

@jreback added this to the 0.14.1 milestone Jun 13, 2014

@jreback
Contributor

jreback commented Jun 13, 2014

ok, need to fix #7440 first

@toddrjen
Contributor Author

This version should fix the bug identified in #7440.

@@ -538,12 +542,9 @@ def _maybe_arg_null_out(result, axis, mask, skipna):


def _get_counts(mask, axis):
    if axis is not None:
Contributor Author

The fix for the issue in #7440 is here

@jreback
Contributor

jreback commented Jun 13, 2014

closer


C:\Users\Jeff Reback\Documents\GitHub\pandas>c:\python27-64\Scripts\nosetests.exe build\lib.win-amd64-2.7\pandas\tests\test_nanops.py
......C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\core\nanops.py:197: ComplexWarning: Casting complex values to real discards the imaginary part
  return ~np.isfinite(values.astype('float64'))
................EF.E......
======================================================================
ERROR: test_nanmax (pandas.tests.test_nanops.TestnanopsDataFrame)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 320, in test_nanmax
    allow_str=False, allow_obj=False)
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 241, in check_funs
    self.check_fun(testfunc, targfunc, 'arr_tdelta', **kwargs)
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 202, in check_fun
    testarval, targarval, targarnanval, **kwargs)
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 188, in check_fun_data
    **kwargs)
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 188, in check_fun_data
    **kwargs)
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 160, in check_fun_data
    **kwargs)
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\core\nanops.py", line 88, in f
    result = alt(values, axis=axis, skipna=skipna, **kwds)
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\core\nanops.py", line 422, in nanmax
    return _maybe_null_out(result, axis, mask)
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\core\nanops.py", line 553, in _maybe_null_out
    if null_mask.any():
AttributeError: ("'bool' object has no attribute 'any'", 'axis: 0 of 0', 'skipna: False', 'kwargs: {}', 'testar: arr_tdelta', 'targar: arr_tdelta', 'targarnan: arr_tdelta')

======================================================================
ERROR: test_nanmin (pandas.tests.test_nanops.TestnanopsDataFrame)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 315, in test_nanmin
    allow_str=False, allow_obj=False)
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 241, in check_funs
    self.check_fun(testfunc, targfunc, 'arr_tdelta', **kwargs)
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 202, in check_fun
    testarval, targarval, targarnanval, **kwargs)
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 188, in check_fun_data
    **kwargs)
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 188, in check_fun_data
    **kwargs)
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 160, in check_fun_data
    **kwargs)
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\core\nanops.py", line 88, in f
    result = alt(values, axis=axis, skipna=skipna, **kwds)
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\core\nanops.py", line 392, in nanmin
    return _maybe_null_out(result, axis, mask)
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\core\nanops.py", line 553, in _maybe_null_out
    if null_mask.any():
AttributeError: ("'bool' object has no attribute 'any'", 'axis: 0 of 0', 'skipna: False', 'kwargs: {}', 'testar: arr_tdelta', 'targar: arr_tdelta', 'targarnan: arr_tdelta')

======================================================================
FAIL: test_nanmean (pandas.tests.test_nanops.TestnanopsDataFrame)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 290, in test_nanmean
    allow_str=False, allow_date=False)
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 216, in check_funs
    self.check_fun(testfunc, targfunc, 'arr_int', **kwargs)
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 202, in check_fun
    testarval, targarval, targarnanval, **kwargs)
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 161, in check_fun_data
    self.check_results(targ, res, axis)
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 124, in check_results
    tm.assert_almost_equal(targ, res)
  File "testing.pyx", line 58, in pandas._testing.assert_almost_equal (pandas\src\testing.c:2465)
  File "testing.pyx", line 93, in pandas._testing.assert_almost_equal (pandas\src\testing.c:1793)
  File "testing.pyx", line 93, in pandas._testing.assert_almost_equal (pandas\src\testing.c:1793)
  File "testing.pyx", line 136, in pandas._testing.assert_almost_equal (pandas\src\testing.c:2299)
AssertionError: ('expected -2.00000 but got -1.63636, with decimal 5', 'axis: 0 of 2', 'skipna: False', 'kwargs: {}', 'testar: arr_int', 'targar: arr_int', 'targarnan: arr_int')

----------------------------------------------------------------------
Ran 32 tests in 1.047s

FAILED (errors=2, failures=1)

C:\Users\Jeff Reback\Documents\GitHub\pandas>
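
The ComplexWarning in the log above comes from casting complex values to float64 before the finiteness check. As a minimal sketch only, and not the code merged in this PR, the helper below (the name `has_infs_sketch` is hypothetical) shows one way to test for infinities without that cast, relying on the fact that `np.isinf` accepts float and complex input directly. Treating every other dtype as free of infinities is an assumption of the sketch.

```python
import numpy as np


def has_infs_sketch(values):
    """Return True if any element is +inf or -inf (hypothetical helper)."""
    values = np.asarray(values)
    if values.dtype.kind in "fc":
        # np.isinf handles float and complex arrays as-is, so there is no
        # cast to float64 and no discarded imaginary part.
        return bool(np.isinf(values).any())
    # Assumption of this sketch: integer, boolean, datetime-like, string
    # and object arrays are reported as holding no infinities.
    return False


print(has_infs_sketch(np.array([1.0, np.inf])))
print(has_infs_sketch(np.array([1 + 1j, complex(np.inf, 0)])))
print(has_infs_sketch(np.arange(5)))
```

A real implementation would also need a policy for object arrays, which can hold float infinities; that is left out here on purpose.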

@toddrjen
Contributor Author

That is a separate problem resulting from the same underlying issue: scalars being converted to native Python objects. I am making a fix for these new issues now, but I don't think there is any way for me to fix it generally.

Can you do me a favor and tell me what happens when you do these:

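import numpy as np

# 1. Type of an entry taken from the shape tuple
a = np.zeros(11).astype('bool')
b = a.shape[0]
type(b)

# 2. Type of a plain sum along axis 0
a = np.zeros(11).astype('bool')
b = a.sum(0)
type(b)

# 3. Type of a shape entry minus a sum accumulated in the array's own dtype
a = np.zeros(11).astype('bool')
b = a.shape[0] - a.sum(0, dtype=a.dtype)
type(b)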

Whatever the case, this seems to be an upstream numpy issue. I can work around it easily enough here, and I can't find any more code in nanops that looks like it will trigger it, but there is no telling where else it might appear. You should probably see if you can reproduce it in numpy 1.9rc and, if so, report it upstream. I can't reproduce it on any of my setups, so there isn't any point in my reporting it.
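
The `'bool' object has no attribute 'any'` errors in the earlier traceback are what happens when a mask arrives as a plain Python bool instead of a NumPy value. One possible workaround, sketched here under the assumption that only a yes/no answer is needed (the function name `any_true` is hypothetical and is not taken from the PR), is to call the module-level `np.any`, which accepts Python scalars, NumPy scalars, and arrays:

```python
import numpy as np


def any_true(mask):
    # np.any coerces its argument to an array first, so it works even when
    # mask is a plain Python bool that has no .any() method of its own.
    return bool(np.any(mask))


print(hasattr(True, "any"))            # Python bool: no .any method
print(hasattr(np.bool_(True), "any"))  # NumPy bool scalar: has .any
print(any_true(True))
print(any_true(np.zeros(3, dtype=bool)))
```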

@toddrjen
Contributor Author

This version should also fix the new issue.

@jreback
Contributor

jreback commented Jun 13, 2014

Windows is an odd beast - numpy does some odd things with type conversions there (because the Windows default int is int32 EVEN ON 64-bit), really weird. Not a numpy bug, just some idiosyncratic behavior.

here's the results..

Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

C:\Users\Jeff Reback>c:\python27-64\python
Python 2.7.5 (default, May 15 2013, 22:44:16) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> a=np.zeros(11).astype('bool')
>>> b=a.shape[0]
>>> type(b)
<type 'long'>
>>> a=np.zeros(11).astype('bool')
>>> b=a.sum(0)
>>> type(b)
<type 'numpy.int32'>
>>> a=np.zeros(11).astype('bool')
>>> b=a.shape[0] - a.sum(0, dtype=a.dtype)
>>> type(b)
<type 'numpy.int64'>
>>>
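
The transcript above shows three different integer types coming out of nearly identical expressions. A small sketch of how the same check could be written so that it does not depend on the platform's default integer width follows; the variable names are illustrative, and the exact types printed will vary with the operating system and NumPy version.

```python
import numpy as np

a = np.zeros(11, dtype=bool)

n_rows = a.shape[0]   # plain Python integer from the shape tuple
n_true = a.sum(0)     # NumPy integer scalar; its width depends on the platform

print(type(n_rows).__name__, type(n_true).__name__)
print(np.dtype(int))  # NumPy's default integer dtype on this platform

# Asking for the accumulator dtype explicitly removes the platform dependence.
n_true64 = a.sum(0, dtype=np.int64)
print(n_true64.dtype)
```

Testing with `isinstance(x, np.integer)` rather than comparing against one concrete type such as `np.int64` keeps a check valid on both kinds of platform.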

@jreback
Contributor

jreback commented Jun 13, 2014

ok fixed those other 2....last one is here

.................F........
======================================================================
FAIL: test_nanmean (pandas.tests.test_nanops.TestnanopsDataFrame)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 290, in test_nanmean
    allow_str=False, allow_date=False)
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 216, in check_funs
    self.check_fun(testfunc, targfunc, 'arr_int', **kwargs)
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 202, in check_fun
    testarval, targarval, targarnanval, **kwargs)
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 161, in check_fun_data
    self.check_results(targ, res, axis)
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-2.7\pandas\tests\test_nanops.py", line 124, in check_results
    tm.assert_almost_equal(targ, res)
  File "testing.pyx", line 58, in pandas._testing.assert_almost_equal (pandas\src\testing.c:2465)
  File "testing.pyx", line 93, in pandas._testing.assert_almost_equal (pandas\src\testing.c:1793)
  File "testing.pyx", line 93, in pandas._testing.assert_almost_equal (pandas\src\testing.c:1793)
  File "testing.pyx", line 136, in pandas._testing.assert_almost_equal (pandas\src\testing.c:2299)
AssertionError: ('expected -2.00000 but got -1.63636, with decimal 5', 'axis: 0 of 2', 'skipna: False', 'kwargs: {}', 'testar: arr_int', 'targar: arr_int', 'targarnan: arr_int')

----------------------------------------------------------------------
Ran 32 tests in 1.013s

FAILED (failures=1)

C:\Users\Jeff Reback\Documents\GitHub\pandas>

@jreback
Contributor

jreback commented Jun 13, 2014

(Pdb) l
125             except:
126                 # There are sometimes rounding errors with
127                 # complex and object dtypes.
128                 # If it isn't one of those, re-raise the error.
129                 if not hasattr(res, 'dtype') or res.dtype.kind not in ['c', 'O']:
130  ->                 raise
131                 # convert object dtypes to something that can be split into
132                 # real and imaginary parts
133                 if res.dtype.kind == 'O':
134                     if targ.dtype.kind != 'O':
135                         res = res.astype(targ.dtype)
(Pdb) l
136                     else:
137                         try:
138                             res = res.astype('c16')
139                         except:
140                             res = res.astype('f8')
141                         try:
142                             targ = targ.astype('c16')
143                         except:
144                             targ = targ.astype('f8')
145                 # there should never be a case where numpy returns an object
146                 # but nanops doesn't, so make that an exception
(Pdb) p targ
array([[-1.63636364,  1.        , -0.54545455,  2.        , -1.90909091],
       [-3.27272727, -0.09090909, -3.54545455, -0.90909091, -0.90909091],
       [-0.90909091, -2.18181818, -0.45454545, -3.72727273,  0.27272727],
       [-1.72727273, -0.09090909, -0.09090909, -1.09090909, -3.63636364],
       [ 1.72727273, -4.18181818, -0.72727273,  0.        , -1.        ],
       [ 0.36363636, -0.72727273, -2.45454545,  2.72727273, -3.72727273],
       [ 0.81818182, -3.        ,  1.        ,  0.63636364, -0.72727273]])
(Pdb) p targ.dtype
dtype('float64')
(Pdb) p res
array([[-2,  1, -1,  2, -2],
       [-4, -1, -4, -1, -1],
       [-1, -3, -1, -4,  0],
       [-2, -1, -1, -2, -4],
       [ 1, -5, -1,  0, -1],
       [ 0, -1, -3,  2, -4],
       [ 0, -3,  1,  0, -1]], dtype=int64)
(Pdb) p res.dtype
dtype('int64')
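
One reading of the values above, offered as an assumption rather than a confirmed diagnosis: each entry of `res` is the floor of the matching entry of `targ` (for example -2 against -1.63636364, which is -18/11). That pattern is what integer floor division of an integer sum by an integer count produces, as `/` does for integer operands on Python 2. The sketch below reproduces the first three entries of the first row with made-up sums chosen to match.

```python
import numpy as np

# Hypothetical column sums and row count, picked so that the quotients match
# the first three entries of the first row shown in the pdb session.
total = np.array([-18, 11, -6], dtype=np.int64)
count = 11

floored = total // count           # floor division: -18 // 11 == -2
true_mean = total / float(count)   # float division keeps the fraction

print(floored)
print(true_mean)
```

Converting the count (or the sum) to float before dividing is one way to get the `targ` values on either Python version.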

@toddrjen
Contributor Author

Similar thing again. Try this version.

@toddrjen
Contributor Author

Wait, no, give me a second.

@toddrjen
Contributor Author

Alright, now try.

@jreback
Contributor

jreback commented Jun 13, 2014

same result

@toddrjen
Contributor Author

Try this version please

@jreback
Contributor

jreback commented Jun 13, 2014

looks good!

ok...ping when passes travis and we'll get this in

thanks!

@toddrjen
Contributor Author

Travis tests pass

jreback added a commit that referenced this pull request Jun 13, 2014
Fix bug where ``nanops._has_infs`` doesn't work with many dtypes (issue #7357)
@jreback merged commit 9659602 into pandas-dev:master Jun 13, 2014

@jreback
Contributor

jreback commented Jun 13, 2014

ok.gr8!

@toddrjen
Contributor Author

Also fixes #7415

Labels: Bug; Dtype Conversions (unexpected or buggy dtype conversions); Internals (related to non-user accessible pandas implementation)