BUG: .loc indexing not preserving Index type #15166

Closed
@watercrossing

Description

import numpy as np
import pandas as pd

# Build a MultiIndex whose first level is a CategoricalIndex.
# Note: `labels` is the pandas 0.19 keyword; later releases renamed it to `codes`.
test = pd.DataFrame(data=np.arange(2, 22, 2),
                    index=pd.MultiIndex(levels=[pd.CategoricalIndex(["a", "b"]), range(10)],
                                        labels=[[0]*5 + [1]*5, range(10)],
                                        names=["Index1", "Index2"]))
test.index.levels[0]
# CategoricalIndex([u'a', u'b'], categories=[u'a', u'b'], ordered=False, name=u'Index1', dtype='category')
test.loc[["a"]].index.levels[0]
# Index([u'a'], dtype='object', name=u'Index1')

Problem description

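When rows are selected with .loc from a MultiIndex whose first level is a CategoricalIndex, that level comes back as a plain object-dtype Index. The categorical type, including its categories, is lost.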

Expected Output

test.loc[["a"]].index.levels[0]
# CategoricalIndex([u'a'], categories=[u'a', u'b'], ordered=False, name=u'Index1', dtype='category')
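Until the level type is preserved by .loc itself, one possible workaround is to re-cast the affected level after the selection. The sketch below is only an illustration, not part of pandas: `restore_categorical_level` is a hypothetical helper name, and the frame is built with `MultiIndex.from_arrays` and `pd.Categorical` (an assumption made so the snippet also runs on pandas releases where the `labels` keyword no longer exists). Whether the cast is needed at all depends on the pandas version in use.

```python
import numpy as np
import pandas as pd


def restore_categorical_level(result, original, level=0):
    """Hypothetical helper: re-cast one MultiIndex level of `result` to the
    categorical type that the same level has in `original`."""
    orig_level = original.index.levels[level]
    if not isinstance(orig_level, pd.CategoricalIndex):
        return result  # the original level is not categorical; nothing to restore

    # Rebuild the level with the original categories and ordering.
    new_level = pd.CategoricalIndex(
        result.index.levels[level],
        categories=orig_level.categories,
        ordered=orig_level.ordered,
        name=orig_level.name,
    )
    out = result.copy()
    out.index = result.index.set_levels(new_level, level=level)
    return out


# Same data as in the report, built without the `labels` keyword.
index = pd.MultiIndex.from_arrays(
    [pd.Categorical(["a"] * 5 + ["b"] * 5), np.arange(10)],
    names=["Index1", "Index2"],
)
test = pd.DataFrame(data=np.arange(2, 22, 2), index=index)

subset = test.loc[["a"]]
restored = restore_categorical_level(subset, test)
print(type(restored.index.levels[0]).__name__)
```

The helper leaves the row labels untouched and only changes the type of the level, so it should be safe to apply even on versions where the selection already returns a categorical level.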

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.10.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.32-431.11.2.el6.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_GB.utf8
LANG: en_GB.utf8
LOCALE: None.None

pandas: 0.19.2
nose: 1.3.7
pip: 9.0.1
setuptools: 32.3.1
Cython: None
numpy: 1.11.3
scipy: 0.18.1
statsmodels: None
xarray: None
IPython: 5.1.0
sphinx: 1.5.1
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.9999999
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.4
boto: None
pandas_datareader: None
