fix crash on unicode level names #9650

Closed

Conversation

@alavrik alavrik commented Mar 13, 2015

This fixed my problem when calling DataFrame.stack(). I'm not really familiar with pandas internals, but I couldn't find any reason why unicode strings wouldn't be accepted here.

@jreback
Contributor

jreback commented Mar 13, 2015

tests would be needed

@jreback jreback added the Reshaping (Concat, Merge/Join, Stack/Unstack, Explode), Unicode (Unicode strings), and Compat (pandas objects compatibility with NumPy or Python functions) labels Mar 13, 2015
@alavrik
Author

alavrik commented Mar 14, 2015

To be honest, I don't know pandas internals/tests and I don't use python3. Please consider this as a bug report. Let me know if you need help with reproducing it.

@jreback
Contributor

jreback commented Mar 14, 2015

@alavrik

can you show some code which reproduces the error?

the point of a bug report is to say: here is what I did, here's how to reproduce it, it's doing x, but I think it should be doing y

we cannot simply change code because it 'looks right'. then what happens when someone decides that some other way looks right?

this is not about knowing pandas internals or not, it's about giving a bug report that can lead to tests that can lead to fixed code

sorry for being harsh, but I find lots of people writing code without even thinking about testing it.

@alavrik
Author

alavrik commented Mar 23, 2015

@jreback
You are right. Here's the test (Python 2.7.5, Pandas 0.15.2):

import pandas as pd
i = pd.MultiIndex(
    levels=[[u'foo', u'bar'], [u'one', u'two'], [u'a', u'b']],
    labels=[[0, 0, 1, 1], [0, 1, 0, 1], [1, 0, 1, 0]],
    names=[u'first', u'second', u'third'])
s = pd.Series(0, index=i)
s.unstack([1, 2]).stack(0)

and this is the last couple of frames of the error output:

/Users/alavrik/devel/ipython/ve/lib/python2.7/site-packages/pandas/core/index.pyc in sortlevel(self, level, ascending, sort_remaining)
   3868         if isinstance(level, (str, int)):
   3869             level = [level]
-> 3870         level = [self._get_level_number(lev) for lev in level]
   3871 
   3872         # partition labels and shape

/Users/alavrik/devel/ipython/ve/lib/python2.7/site-packages/pandas/core/index.pyc in _get_level_number(self, level)
   3166         except ValueError:
   3167             if not isinstance(level, int):
-> 3168                 raise KeyError('Level %s not found' % str(level))
   3169             elif level < 0:
   3170                 level += self.nlevels

KeyError: 'Level t not found'

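It is easy to see what's going on. On Python 2, a unicode level name is not an instance of str, so the isinstance(level, (str, int)) check in sortlevel fails and the name is never wrapped in a list. Instead of iterating over [level], the code iterates over the characters of the unicode level name. In this case, it fails on t, the first character of the level name u'third'.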
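
For readers on Python 3, where the crash does not reproduce because str is already a text type, the mechanism can be imitated with a small sketch. The helper as_level_list is hypothetical and only mirrors the shape of the check shown in the traceback; it is not the pandas implementation. It assumes that Python 3's bytes stands in for Python 2's str, and Python 3's str for Python 2's unicode.

```python
# Hypothetical helper, NOT pandas code: it only mirrors the
# "wrap a scalar level in a list" step seen in the traceback.
def as_level_list(level, scalar_types):
    """Return a list of levels, wrapping `level` if it is a scalar."""
    if isinstance(level, scalar_types):
        level = [level]
    return list(level)


# Assumption: on Python 3, `bytes` plays the role of Python 2's `str`
# and `str` plays the role of Python 2's `unicode`. Checking against
# (bytes, int) therefore imitates `isinstance(level, (str, int))`
# as it behaves on Python 2.
narrow = as_level_list(u'third', (bytes, int))

# Accepting text strings as scalars as well keeps the name intact.
wide = as_level_list(u'third', (str, bytes, int))

print(narrow)  # ['t', 'h', 'i', 'r', 'd']
print(wide)    # ['third']
```

With the narrow check, the first element handed to the level lookup is the single character t, which matches the KeyError above. On Python 2 itself, checking against basestring covers both str and unicode; whether that is the right fix for pandas is left to the maintainers.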

@jreback jreback added this to the 0.16.1 milestone Mar 23, 2015
@jreback
Contributor

jreback commented Apr 11, 2015

closing in favor of #9856 bug report

@jreback jreback closed this Apr 11, 2015
3 participants