TST: compat with numpy 1.14 #18123

jreback · 2017-11-05T12:37:52Z

I think this is a very recent change in numpy in how ndarrays are printed. So we would conditionally change the expected value in the test if not _np_version_under1p14.
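A minimal sketch of such a version gate, using only the two strings that appear in the traceback below. The helpers `np_version_under` and `expected_float_str` are hypothetical names for illustration, not pandas functions, and the version parsing deliberately looks only at the leading `major.minor` components.

```python
import re


def np_version_under(version, major, minor):
    """Return True if a NumPy version string is older than major.minor."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % (version,))
    return (int(match.group(1)), int(match.group(2))) < (major, minor)


def expected_float_str(numpy_version):
    """Pick the expected astype(str) text for 1.12345678901234567890."""
    if np_version_under(numpy_version, 1, 14):
        return '1.12345678901'       # truncated text reported for 1.13.3
    return '1.1234567890123457'      # full-precision text from the failing build


print(expected_float_str("1.13.3"))
print(expected_float_str("1.14.0.dev0"))
```

In a real test the version string would come from `np.__version__`; it is passed in explicitly here so the sketch runs without NumPy installed.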

https://travis-ci.org/pandas-dev/pandas/jobs/297507212

____________________ TestDataFrameDataTypes.test_astype_str ____________________
[gw0] linux -- Python 3.6.3 /home/travis/miniconda3/envs/pandas/bin/python
self = <pandas.tests.frame.test_dtypes.TestDataFrameDataTypes object at 0x7f31d2d6d748>
    def test_astype_str(self):
        # GH9757
        a = Series(date_range('2010-01-04', periods=5))
        b = Series(date_range('3/6/2012 00:00', periods=5, tz='US/Eastern'))
        c = Series([Timedelta(x, unit='d') for x in range(5)])
        d = Series(range(5))
        e = Series([0.0, 0.2, 0.4, 0.6, 0.8])
    
        df = DataFrame({'a': a, 'b': b, 'c': c, 'd': d, 'e': e})
    
        # datetimelike
        # Test str and unicode on python 2.x and just str on python 3.x
        for tt in set([str, compat.text_type]):
            result = df.astype(tt)
    
            expected = DataFrame({
                'a': list(map(tt, map(lambda x: Timestamp(x)._date_repr,
                                      a._values))),
                'b': list(map(tt, map(Timestamp, b._values))),
                'c': list(map(tt, map(lambda x: Timedelta(x)
                                      ._repr_base(format='all'), c._values))),
                'd': list(map(tt, d._values)),
                'e': list(map(tt, e._values)),
            })
    
            assert_frame_equal(result, expected)
    
        # float/nan
        # 11302
        # consistency in astype(str)
        for tt in set([str, compat.text_type]):
            result = DataFrame([np.NaN]).astype(tt)
            expected = DataFrame(['nan'])
            assert_frame_equal(result, expected)
    
            result = DataFrame([1.12345678901234567890]).astype(tt)
            expected = DataFrame(['1.12345678901'])
>           assert_frame_equal(result, expected)
pandas/tests/frame/test_dtypes.py:535: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pandas/util/testing.py:1397: in assert_frame_equal
    obj='DataFrame.iloc[:, {idx}]'.format(idx=i))
pandas/util/testing.py:1276: in assert_series_equal
    obj='{obj}'.format(obj=obj))
pandas/_libs/testing.pyx:59: in pandas._libs.testing.assert_almost_equal
    cpdef assert_almost_equal(a, b,
pandas/_libs/testing.pyx:173: in pandas._libs.testing.assert_almost_equal
    raise_assert_detail(obj, msg, lobj, robj)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
obj = 'DataFrame.iloc[:, 0]'
message = 'DataFrame.iloc[:, 0] values are different (100.0 %)'
left = '[1.1234567890123457]', right = '[1.12345678901]', diff = None
    def raise_assert_detail(obj, message, left, right, diff=None):
        if isinstance(left, np.ndarray):
            left = pprint_thing(left)
        elif is_categorical_dtype(left):
            left = repr(left)
        if isinstance(right, np.ndarray):
            right = pprint_thing(right)
        elif is_categorical_dtype(right):
            right = repr(right)
    
        msg = """{obj} are different
    
    {message}
    [left]:  {left}
    [right]: {right}""".format(obj=obj, message=message, left=left, right=right)
    
        if diff is not None:
            msg += "\n[diff]: {diff}".format(diff=diff)
    
>       raise AssertionError(msg)
E       AssertionError: DataFrame.iloc[:, 0] are different
E       
E       DataFrame.iloc[:, 0] values are different (100.0 %)
E       [left]:  [1.1234567890123457]
E       [right]: [1.12345678901]
pandas/util/testing.py:1093: AssertionError


jreback · 2017-11-05T12:38:20Z

in 1.13.3

In [19]: DataFrame([1.12345678901234567890]).astype(str)
Out[19]: 
               0
0  1.12345678901

cc @charris

charris · 2017-11-05T15:28:46Z

1.13.3 or current 1.14? In any case, this is probably numpy/numpy#9941. NumPy now has its own value -> string conversion functions and there will probably be some small changes in the output. However, the strings should preserve the value on back conversion.

charris · 2017-11-05T15:40:59Z

Although back conversion doesn't succeed here.

In [3]: a
Out[3]: array([1.12345679])

In [4]: b = array([1.12345679])

In [5]: a == b
Out[5]: array([False], dtype=bool)

So there may be other things going on.
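One possible explanation, sketched in plain Python rather than NumPy: the array display is rounded to a fixed number of decimals (8 is assumed here, to match the `1.12345679` shown), so retyping the displayed text gives a different float, whereas the shortest round-trip repr keeps the value.

```python
x = 1.12345678901234567890

displayed = "%.8f" % x   # assumed stand-in for the 8-decimal array display
full = repr(x)           # Python 3 repr: shortest string that round-trips

print(displayed, float(displayed) == x)   # 1.12345679 False
print(full, float(full) == x)             # 1.1234567890123457 True
```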

charris · 2017-11-05T15:42:17Z

@ahaldane Thoughts?

charris · 2017-11-05T16:24:26Z

Yeah, just looks like a printing change

In [1]: a = array([1.12345678901234567890])

In [2]: a[0]
Out[2]: 1.1234567890123457

jreback · 2017-11-05T16:26:34Z

yep i think we can just fix the test on our side

ahaldane · 2017-11-05T16:46:48Z

I see what is going on here. Numpy's casting code actually uses str with a python-float as intermediate, which drops the extra precision.

When casting from an f8 array to an S array, numpy essentially does this:

for i in range(len(arr)):
    dst[i] = str(float(src[i]))

and that's using python's str and float functions. Note that str(float) in python 2 truncates to 12 significant digits, while repr prints all the digits needed to round-trip the value.

We can see this more clearly as follows using the numpy casting loop:

>>> a = np.array([1.12345678901234567890])
>>> b = np.zeros(1, dtype='S20')
>>> b[:] = a
>>> b
array(['1.12345678901'],
      dtype='|S20')

Now let's avoid numpy's casting code by assigning directly:

>>> b[0] = a[0]
>>> b
array(['1.1234567890123457'],
      dtype='|S20')

Compare to:

>>> b[0] = str(float(a[0]))
>>> b
array(['1.12345678901'],
      dtype='|S20')
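The three results above can be mimicked without NumPy. This is only a sketch under stated assumptions: `py2_style_str` is a hypothetical stand-in that approximates Python 2's `str(float)` with the `%.12g` format, and `cast_floats_to_text` imitates the element-wise loop described earlier, not NumPy's actual C implementation.

```python
def py2_style_str(value):
    """Approximate Python 2's str(float): 12 significant digits."""
    return "%.12g" % value


def cast_floats_to_text(src, to_text):
    """Convert each element through a Python float, then to text."""
    return [to_text(float(item)) for item in src]


src = [1.12345678901234567890]
print(cast_floats_to_text(src, py2_style_str))  # truncated, like the cast via str
print(cast_floats_to_text(src, repr))           # full precision, round-trips
```

Swapping the `to_text` argument shows the whole difference: the loop is the same, and only the float-to-text function decides whether precision survives.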

I see a few possible things we could do to make your tests work:

1. In numpy/numpy#9941 I made the str(np.float64) output full precision. I could roll that back so it only outputs 12 significant digits, to be like python 2 str(float).
2. It is easy to make the np.float -> str casting code use repr instead of str. That would also make it so we can round-trip floats through the casts. However, it would be a behavior change for all casts to string type. (Also, it wouldn't be right for float128.)
3. Write more careful casting code for np.float -> string. A lot of work, and I don't really want to do it right now.

I might just try out option 1.

ahaldane · 2017-11-06T00:26:11Z

Note that this is only a problem in python2, since str(float) is truncated only there. In python3, both str(float) and repr(float) output all the digits.

This means your test probably fails in python3, even though I don't precisely understand why our recent changes affected this test the way they did. Pandas probably has an overridden astype function that calls str(np.float64()) somehow.

In any case, in numpy 1.14 we are planning not to truncate the str, even in python2. So I think you will need to add a few digits of precision here.
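If pinning an exact string proves too brittle across NumPy versions, one alternative (not necessarily what the pandas fix did) is to compare the parsed values with a tolerance instead of the text. `assert_float_text_close` below is a hypothetical helper, not a pandas function, and the default tolerance is an arbitrary choice for illustration.

```python
import math


def assert_float_text_close(result, expected, rel_tol=1e-10):
    """Assert that two float strings parse to nearly the same value."""
    left, right = float(result), float(expected)
    if not math.isclose(left, right, rel_tol=rel_tol):
        raise AssertionError("%r and %r differ beyond rel_tol=%g"
                             % (result, expected, rel_tol))


# Passes for both the truncated and the full-precision rendering.
assert_float_text_close('1.1234567890123457', '1.12345678901')
```

The trade-off is that such a check no longer verifies the exact text produced by astype(str), only that the text still represents roughly the same number.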

jreback added the Compat, Difficulty Novice, and Testing labels Nov 5, 2017

jreback added this to the 0.21.1 milestone Nov 5, 2017

ahaldane mentioned this issue Nov 5, 2017

BUG: str(np.float) should print with the same number of digits as python str(float) numpy/numpy#9966

Merged

ahaldane mentioned this issue Nov 6, 2017

BUG: cast to str_ should not convert to pure-python intermediate numpy/numpy#9978

Merged

jreback added a commit to jreback/pandas that referenced this issue Nov 7, 2017

CI: don't show miniconda output on install

b2c5ca5

COMPAT: compat with numpy >= 1.14 on str repr closes pandas-dev#18123

jreback added a commit to jreback/pandas that referenced this issue Nov 8, 2017

CI: don't show miniconda output on install

fb89356

COMPAT: compat with numpy >= 1.14 on str repr TST: temp disable python-dateutil from master closes pandas-dev#18123

jreback mentioned this issue Nov 8, 2017

CI: don't show miniconda output on install / numpy 1.14 compat #18157

Merged

jreback added a commit to jreback/pandas that referenced this issue Nov 8, 2017

CI: don't show miniconda output on install

80cbb52

COMPAT: compat with numpy >= 1.14 on str repr TST: temp disable python-dateutil from master closes pandas-dev#18123

jreback closed this as completed in #18157 Nov 8, 2017

jreback added a commit that referenced this issue Nov 8, 2017

CI: don't show miniconda output on install / numpy 1.14 compat (#18157)

8dac633

CI: don't show miniconda output on install COMPAT: compat with numpy >= 1.14 on str repr TST: temp disable python-dateutil from master closes #18123

ahaldane mentioned this issue Nov 21, 2017

Unexpected itemsize creating string array from numeric dtype array numpy/numpy#10062

Closed
