Based on a mailing list discussion; adding it here for better traceability.
Is there a way to use wildcards when indexing with a MultiIndex?
As an example, let's work with the following DataFrame:
In [10]: df
Out[10]:
              0        1        2       3        4
x y z  -0.02688    1.236   -1.174  -1.026    1.508
a b c    0.9156    1.742   0.4483   1.005   0.3641
p q r    0.2019  -0.3895  -0.8484  0.9992  -0.2274
Selecting all rows where level 0 of the MultiIndex == 'a' is easy:
In [11]: df.ix['a']
Out[11]:
          0      1       2      3       4
b c  0.9156  1.742  0.4483  1.005  0.3641
But what if I want to select all rows where the third level of the MultiIndex (level 2, counting from 0) == 'c', and I don't care what the other levels are?
This is what can be done:
In [12]: df.swaplevel(0, 2).swaplevel(1, 2).ix['c']
Out[12]:
          0      1       2      3       4
a b  0.9156  1.742  0.4483  1.005  0.3641
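For anyone who wants to reproduce the workaround, here is a minimal, self-contained sketch. It rebuilds the example frame from the values shown above (the construction via `MultiIndex.from_tuples` is my assumption, the original post does not show how `df` was created) and uses `.loc` instead of `.ix`, since `.ix` has been removed from later pandas releases.

```python
import pandas as pd

# Rebuild the example frame; the construction is an assumption,
# only the values and index labels come from the post.
index = pd.MultiIndex.from_tuples([("x", "y", "z"), ("a", "b", "c"), ("p", "q", "r")])
df = pd.DataFrame(
    [
        [-0.02688, 1.236, -1.174, -1.026, 1.508],
        [0.9156, 1.742, 0.4483, 1.005, 0.3641],
        [0.2019, -0.3895, -0.8484, 0.9992, -0.2274],
    ],
    index=index,
)

# Move the last level to the front, then restore the order of the other two.
reordered = df.swaplevel(0, 2).swaplevel(1, 2)

# Selecting on the (new) outermost level drops that level from the result.
result = reordered.loc["c"]
print(result)
```

The two `swaplevel` calls are only there to bring the level of interest to position 0, which is what makes the plain label lookup possible.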
Wes's idea is that this merits a new API function or an addition to an existing one,
e.g. modifying DataFrame.xs to work like:
df.xs('c', axis=0, level=2)
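As a sketch of how the proposed call behaves: current pandas releases do accept a `level` argument on `DataFrame.xs`, so the snippet below should run as written there. The frame construction is again an assumption; `drop_level=False` is shown only to illustrate keeping the selected level in the result.

```python
import pandas as pd

# Example frame rebuilt from the values in the post (construction assumed).
index = pd.MultiIndex.from_tuples([("x", "y", "z"), ("a", "b", "c"), ("p", "q", "r")])
df = pd.DataFrame(
    [
        [-0.02688, 1.236, -1.174, -1.026, 1.508],
        [0.9156, 1.742, 0.4483, 1.005, 0.3641],
        [0.2019, -0.3895, -0.8484, 0.9992, -0.2274],
    ],
    index=index,
)

# Cross-section on the third level (level 2); the matched level is dropped.
selected = df.xs("c", axis=0, level=2)

# Same selection, but keeping all three index levels.
kept = df.xs("c", axis=0, level=2, drop_level=False)

print(selected)
print(kept)
```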
Another idea is to do something similar to numpy and use : as a wildcard.
For example in numpy:
In [7]: y = np.arange(27).reshape(3,3,3)
In [8]: y
Out[8]:
array([[[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8]],

       [[ 9, 10, 11],
        [12, 13, 14],
        [15, 16, 17]],

       [[18, 19, 20],
        [21, 22, 23],
        [24, 25, 26]]])
In [9]: y[:, :, 0]
Out[9]:
array([[ 0,  3,  6],
       [ 9, 12, 15],
       [18, 21, 24]])
y[:, :, 0] selects every element whose index along the last axis ("level 2") is 0.
This would translate to df.ix[:, :, 'c'] on the first example.
Another example, on a DataFrame whose MultiIndex has many levels, selecting on more than one level at once: df.ix[:, :, 'c', :, 10]
This approach will look familiar to numpy users, and no additional API method is needed for wildcard indexing.
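For comparison, later pandas releases provide `pd.IndexSlice` for use with `.loc`, which comes close to the `:` wildcard idea, although it is not the exact `df.ix[:, :, 'c']` syntax proposed here. The sketch below assumes such a release; the frame construction and the `sort_index()` call (added as a precaution, since slicing a MultiIndex generally expects a sorted index) are my additions.

```python
import pandas as pd

# Example frame rebuilt from the values in the post (construction assumed).
index = pd.MultiIndex.from_tuples([("x", "y", "z"), ("a", "b", "c"), ("p", "q", "r")])
df = pd.DataFrame(
    [
        [-0.02688, 1.236, -1.174, -1.026, 1.508],
        [0.9156, 1.742, 0.4483, 1.005, 0.3641],
        [0.2019, -0.3895, -0.8484, 0.9992, -0.2274],
    ],
    index=index,
).sort_index()

idx = pd.IndexSlice

# ":" acts as a wildcard for the first two levels; the trailing ":" selects all columns.
wild = df.loc[idx[:, :, "c"], :]

# The same selection spelled out with explicit slice objects.
explicit = df.loc[(slice(None), slice(None), "c"), :]

print(wild)
```

Unlike the `xs` form, this selection keeps all index levels in the result.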