Asymmetric behavior between index and columns when getting incomplete label #17029

Open
toobaz opened this issue Jul 19, 2017 · 6 comments
Labels
Bug Indexing Related to indexing on series/frames, not to indexes themselves

Comments

@toobaz
Member

toobaz commented Jul 19, 2017

Code Sample, a copy-pastable example if possible

In [2]: df = pd.DataFrame([[1,2], [3,4]], index=pd.MultiIndex.from_tuples([['a', 'b'], ['c', '']]))

In [3]: df.loc['c'].shape
Out[3]: (1, 2)

In [4]: df.transpose().loc[:, 'c'].shape
Out[4]: (2,)

Problem description

Maybe the "fill an incomplete key with empty string(s)" rule is not implemented at all for rows? (See also #17024.) If this is the case, then I think it should be.
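For reference, the sample above can be written as a standalone script. This is only a sketch: the names `row_sel`, `col_sel`, and `symmetric` are illustrative and not from the report, and the printed shapes depend on the pandas version, so none are annotated here.

```python
import pandas as pd

# Same frame as in the report: the second row key is ('c', '').
index = pd.MultiIndex.from_tuples([("a", "b"), ("c", "")])
df = pd.DataFrame([[1, 2], [3, 4]], index=index)

# Partial key 'c' applied to the row MultiIndex.
row_sel = df.loc["c"]

# The same partial key applied to the column MultiIndex of the transpose.
col_sel = df.transpose().loc[:, "c"]

print(type(row_sel).__name__, row_sel.shape)
print(type(col_sel).__name__, col_sel.shape)

# If rows and columns were treated alike, both selections would have
# the same number of dimensions.
symmetric = row_sel.ndim == col_sel.ndim
print("symmetric:", symmetric)
```

Comparing `ndim` rather than hard-coding shapes keeps the check meaningful whichever way the asymmetry is eventually resolved.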

Expected Output

The same as Out[4] but reversed.

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: 9e7666d
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.0-3-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: it_IT.UTF-8
LOCALE: it_IT.UTF-8

pandas: 0.21.0.dev+265.g9e7666dae
pytest: 3.0.6
pip: 9.0.1
setuptools: None
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
xarray: None
IPython: 5.1.0.dev
sphinx: 1.5.6
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.2.1
tables: 3.3.0
numexpr: 2.6.1
feather: 0.3.1
matplotlib: 2.0.2
openpyxl: None
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.6
lxml: None
bs4: 4.5.3
html5lib: 0.999999999
sqlalchemy: 1.0.15
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: 0.2.1

@gfyoung gfyoung added Bug Indexing Related to indexing on series/frames, not to indexes themselves Reshaping Concat, Merge/Join, Stack/Unstack, Explode and removed Bug labels Jul 19, 2017
@gfyoung
Member

gfyoung commented Jul 19, 2017

@toobaz : I think this makes sense to me. Why would you expect the shape to be the same if you transposed?

@chris-b1
Contributor

@toobaz - I agree with your diagnosis: this is most likely due to the empty-string level-dropping magic, xref #11424. It could probably be made consistent.

@gfyoung
Member

gfyoung commented Jul 19, 2017

@chris-b1 : Judging from your response, I'm labeling this as an API issue. I'm not sure I follow the expected output description by @toobaz . Could you explain?

@chris-b1
Contributor

chris-b1 commented Jul 19, 2017

Sure, our basic behavior is that indexing operations that are "slice-like" (e.g. selecting an entire level) on a MultiIndex return a DataFrame. A couple of examples:

In [4]: idx = pd.MultiIndex.from_tuples([('a', ''), ('b', '1'), ('c', '1'), ('c', '2')])

In [5]: df = pd.DataFrame(np.arange(16).reshape(4,4), index=idx, columns=idx)

In [6]: df
Out[6]: 
      a   b   c    
          1   1   2
a     0   1   2   3
b 1   4   5   6   7
c 1   8   9  10  11
  2  12  13  14  15

In [7]: type(df.loc['b', :])
Out[7]: pandas.core.frame.DataFrame

In [8]: type(df.loc['c', :])
Out[8]: pandas.core.frame.DataFrame

In [9]: type(df.loc[:, 'b'])
Out[9]: pandas.core.frame.DataFrame

In [10]: type(df.loc[:, 'c'])
Out[10]: pandas.core.frame.DataFrame

But, as an undocumented "convenience" feature (linked issue), if the selection is on the columns and all deeper levels are labeled with empty strings, the selection collapses into a Series. This collapsing doesn't happen with a row selection (this issue):

In [12]: df.loc[:, 'b']
Out[12]: 
      1
a     1
b 1   5
c 1   9
  2  13

In [13]: df.loc[:, 'a']
Out[13]: 
a        0
b  1     4
c  1     8
   2    12
Name: a, dtype: int32

In [16]: type(df.loc[:, 'a'])
Out[16]: pandas.core.series.Series

In [17]: df.loc['a', :]
Out[17]: 
  a  b  c   
     1  1  2
  0  1  2  3

In [18]: type(df.loc['a', :])
Out[18]: pandas.core.frame.DataFrame
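The comparison above can be condensed into one loop that selects each top-level label on both axes and records the result type. This is a sketch, not part of the original comment: the `results` dictionary is an illustrative name, and the types printed for `'a'` are the version-dependent part, so they are not annotated.

```python
import numpy as np
import pandas as pd

# Same frame as above: the label 'a' has only an empty-string second level.
idx = pd.MultiIndex.from_tuples([("a", ""), ("b", "1"), ("c", "1"), ("c", "2")])
df = pd.DataFrame(np.arange(16).reshape(4, 4), index=idx, columns=idx)

results = {}
for label in ["a", "b", "c"]:
    by_row = df.loc[label, :]   # select the whole first-level label on rows
    by_col = df.loc[:, label]   # select the same label on columns
    results[label] = (type(by_row).__name__, type(by_col).__name__)
    print(label, results[label])
```

For `'b'` and `'c'` both selections are DataFrames, as in the outputs above; any difference between the two entries for `'a'` is the asymmetry this issue describes.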

@gfyoung
Member

gfyoung commented Jul 19, 2017

@chris-b1 : Awesome! That definitely explained it and then some. I think I got confused by the description of the expected output. The expected shape is just the dimensions reversed (it's a transposition).

@toobaz
Member Author

toobaz commented Jul 20, 2017

> The expected shape is just the dimensions reversed (it's a transposition).

My example was maybe a bit cryptic, sorry. The thing is that a shape (1,2) when transposed gives (2,1), not (2,).
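The shape arithmetic can be checked on its own, independently of any MultiIndex. A minimal sketch with an illustrative one-row frame (the names `frame` and `series` are not from the thread):

```python
import pandas as pd

frame = pd.DataFrame([[3, 4]])       # one row, two columns
print(frame.shape)                   # (1, 2)
print(frame.transpose().shape)       # (2, 1): still two-dimensional

series = pd.Series([3, 4])
print(series.shape)                  # (2,): one-dimensional
```

So a selection of shape (2,) is a Series, not the transpose of a (1, 2) DataFrame.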

@mroeschke mroeschke added Bug and removed API Design Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Jun 12, 2021
4 participants