Labels mismatch when adding series with repeated index values in 0.19rc1 #14227

Closed
mrocklin opened this issue Sep 15, 2016 · 7 comments · Fixed by #14230
Labels: Bug, Regression (functionality that used to work in a prior pandas version)


@mrocklin
Contributor

This showed up in the dask.dataframe test suite when testing against the 0.19 release candidate. I'm unsure if this was an intended change or not:

Pandas 0.18

In [1]: import pandas as pd

In [2]: a = pd.Series([1, 2], index=[1, 1])

In [3]: b = pd.Series([10, 10], index=[1, 2])

In [4]: a
Out[4]: 
1    1
1    2
dtype: int64

In [5]: b
Out[5]: 
1    10
2    10
dtype: int64

In [6]: a + b
Out[6]: 
1    11.0
1    12.0
2     NaN
dtype: float64

Pandas 0.19rc1

In [1]: import pandas as pd

In [2]: a = pd.Series([1, 2], index=[1, 1])

In [3]: b = pd.Series([10, 10], index=[1, 2])

In [4]: a
Out[4]: 
1    1
1    2
dtype: int64

In [5]: b
Out[5]: 
1    10
2    10
dtype: int64

In [6]: a + b
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-f96fb8f649b6> in <module>()
----> 1 a + b

/home/mrocklin/Software/anaconda/lib/python3.5/site-packages/pandas/core/ops.py in wrapper(left, right, name, na_op)
    671             return NotImplemented
    672 
--> 673         left, right = _align_method_SERIES(left, right)
    674 
    675         converted = _Op.get_op(left, right, name, na_op)

/home/mrocklin/Software/anaconda/lib/python3.5/site-packages/pandas/core/ops.py in _align_method_SERIES(left, right, align_asobject)
    613                                                 return_indexers=True)
    614             # if DatetimeIndex have different tz, convert to UTC
--> 615             left.index = index
    616             right.index = index
    617 

/home/mrocklin/Software/anaconda/lib/python3.5/site-packages/pandas/core/generic.py in __setattr__(self, name, value)
   2754         try:
   2755             object.__getattribute__(self, name)
-> 2756             return object.__setattr__(self, name, value)
   2757         except AttributeError:
   2758             pass

pandas/src/properties.pyx in pandas.lib.AxisProperty.__set__ (pandas/lib.c:44336)()

/home/mrocklin/Software/anaconda/lib/python3.5/site-packages/pandas/core/series.py in _set_axis(self, axis, labels, fastpath)
    306         object.__setattr__(self, '_index', labels)
    307         if not fastpath:
--> 308             self._data.set_axis(axis, labels)
    309 
    310     def _set_subtyp(self, is_all_dates):

/home/mrocklin/Software/anaconda/lib/python3.5/site-packages/pandas/core/internals.py in set_axis(self, axis, new_labels)
   2774             raise ValueError('Length mismatch: Expected axis has %d elements, '
   2775                              'new values have %d elements' %
-> 2776                              (old_len, new_len))
   2777 
   2778         self.axes[axis] = new_labels

ValueError: Length mismatch: Expected axis has 3 elements, new values have 5 elements
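
For anyone who wants to check a given pandas build against this report, here is a minimal sketch of the reproduction written as an assertion. It assumes the 0.18 output shown above is the desired behavior, and it uses `pd.testing.assert_series_equal`, which is only available in pandas releases later than the ones discussed in this issue. The names `result` and `expected` are just illustrative.

```python
import numpy as np
import pandas as pd

# Same inputs as in the report: `a` has a repeated index label (1, 1).
a = pd.Series([1, 2], index=[1, 1])
b = pd.Series([10, 10], index=[1, 2])

# On an affected build (0.19rc1) this line raises
# "ValueError: Length mismatch: ..." instead of returning a Series.
result = a + b

# Expected result, copied from the pandas 0.18 output in the report.
expected = pd.Series([11.0, 12.0, np.nan], index=[1, 1, 2])
pd.testing.assert_series_equal(result, expected)
```

If the assertion passes, the build behaves like 0.18 for this case.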
@jreback
Contributor

jreback commented Sep 15, 2016

@mrocklin
Contributor Author

Alrighty. Thanks for the pointer.

@jorisvandenbossche
Member

I don't think this change was intentional. The whatsnew also says "Arithmetic operators align both index (no changes)."

@jorisvandenbossche added this to the 0.19.0 milestone Sep 15, 2016
@TomAugspurger
Contributor

@jorisvandenbossche agreed that it wasn't intentional, but does the old behavior make sense under the new equality rules (or the old ones for that matter)?

@jorisvandenbossche
Member

Going by the "align both index" description (though I am not sure whether the docs say the same, or whether this was the original intent), then yes, I would expect this:


In [19]: a_temp, b_temp = a.align(b)

In [20]: a_temp
Out[20]: 
1    1.0
1    2.0
2    NaN
dtype: float64

In [21]: b_temp
Out[21]: 
1    10
1    10
2    10
dtype: int64

In [22]: a_temp + b_temp
Out[22]: 
1    11.0
1    12.0
2     NaN
dtype: float64
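
The same check can be written as a short script. This is only a sketch of the "align, then add" reading described above, not the actual code path inside pandas. It assumes the default `join="outer"` of `Series.align`; the names `a_aligned`, `b_aligned`, `manual` and `direct` are illustrative.

```python
import pandas as pd

a = pd.Series([1, 2], index=[1, 1])
b = pd.Series([10, 10], index=[1, 2])

# Step 1: align both series on the union of their labels (outer join).
# The repeated label 1 in `a` is kept, and label 2 is filled with NaN in `a`.
a_aligned, b_aligned = a.align(b)

# Step 2: after alignment both series share one index, so the addition
# is a plain element-wise operation.
manual = a_aligned + b_aligned

# On a build without the regression, the operator gives the same result.
direct = a + b
assert manual.equals(direct)
```

On 0.19rc1 the `a + b` line is the one that raises, while the explicit `align` call works, as the session above shows.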

@TomAugspurger
Contributor

Whoops, yeah obviously that's the expected output.

@jorisvandenbossche
Member

I wouldn't call it that "obvious" :-), but it does follow from the rules in some way

@jreback modified the milestones: 0.19.0, 0.19.1 Sep 28, 2016
@jorisvandenbossche added the Regression label Sep 29, 2016
@jorisvandenbossche modified the milestones: 0.19.0, 0.19.1 Sep 30, 2016