BUG: categorical/string Series hist() method produces confusing bar plot. #22091

Closed
@nmusolino

Code Sample

In [1]: import pandas

In [5]: s = pandas.Series(list('AAAABBC'))

In [6]: s
Out[6]: 
0    A
1    A
2    A
3    A
4    B
5    B
6    C
dtype: object

In [7]: ax = s.hist()

In [8]: ax.figure.savefig('actual.png')

Problem description

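For a Series of strings, the hist() method draws several differently colored bars at each label on the x-axis instead of a single bar per label.

[Image: actual.png, the plot produced by s.hist()]

This kind of plot is not informative and should not be produced.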

The same result is obtained using s.astype('category').

Expected Output

Either a bar plot of the value counts (see below) or an exception.

In [9]: ax = s.value_counts().plot.bar(color='darkblue')

In [10]: ax.figure.savefig('expected.png')

[Image: expected.png, a bar plot of the value counts]

Note that DataFrame.hist() omits object columns, and s.to_frame('i').hist() raises an exception.
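The two acceptable outcomes described above can be sketched with plain pandas calls. This is only an illustration of the requested behavior, not the pandas implementation: the helper names `require_numeric` and `category_counts` are hypothetical and do not exist in pandas.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def require_numeric(s: pd.Series) -> None:
    """Hypothetical guard: reject non-numeric data before drawing a histogram."""
    if not is_numeric_dtype(s):
        raise TypeError(
            f"hist() requires numeric data, got dtype {s.dtype!r}; "
            "consider plotting s.value_counts() instead"
        )


def category_counts(s: pd.Series) -> pd.Series:
    """Hypothetical helper: one count per distinct label, ordered by label."""
    return s.value_counts().sort_index()


s = pd.Series(list("AAAABBC"))

# Option 1: plot the counts, one bar per label (plotting requires matplotlib).
counts = category_counts(s)
# ax = counts.plot.bar(color="darkblue")

# Option 2: refuse to plot non-numeric data.
try:
    require_numeric(s)
except TypeError as exc:
    print(exc)
```

Here `counts` holds 4 for A, 2 for B and 1 for C, which is the data behind the expected bar plot. The guard also rejects `s.astype('category')`, since categorical data is not treated as numeric by `is_numeric_dtype`.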

Discussion

This is essentially the same question as discussed in #8712 (November 2014). That issue proposes creating a plot instead of raising an error. The point of this issue is different: producing a misleading plot is a bug.

cc @jreback, reporter of #8712

Output of pd.show_versions()

INSTALLED VERSIONS

commit: 114f415
python: 3.7.0.final.0
python-bits: 64
OS: Darwin
OS-release: 14.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.24.0.dev0+365.g114f41534.dirty
pytest: 3.6.3
pip: 10.0.1
setuptools: 39.0.1
Cython: 0.28.4
numpy: 1.15.0
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.4.0
sphinx: 1.7.6
patsy: None
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.2.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0.1
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None
