31 changes: 17 additions & 14 deletions pandas/core/generic.py
@@ -7187,7 +7187,7 @@ def abs(self):

def describe(self, percentiles=None, include=None, exclude=None):
"""
Generates descriptive statistics that summarize the central tendency,
Generate descriptive statistics that summarize the central tendency,
dispersion and shape of a dataset's distribution, excluding
``NaN`` values.

@@ -7231,7 +7231,18 @@ def describe(self, percentiles=None, include=None, exclude=None):

Returns
-------
summary: Series/DataFrame of summary statistics
Series or DataFrame
Summary statistics of the Series or DataFrame provided.

See Also
--------
DataFrame.count: Count number of non-NA/null observations.
DataFrame.max: Maximum of the values in the object.
DataFrame.min: Minimum of the values in the object.
DataFrame.mean: Mean of the values.
DataFrame.std: Standard deviation of the observations.
DataFrame.select_dtypes: Subset of a DataFrame including/excluding
columns based on their dtype.

Notes
-----
@@ -7275,6 +7286,7 @@ def describe(self, percentiles=None, include=None, exclude=None):
50% 2.0
75% 2.5
max 3.0
dtype: float64

Describing a categorical ``Series``.

@@ -7305,9 +7317,9 @@ def describe(self, percentiles=None, include=None, exclude=None):
Describing a ``DataFrame``. By default only numeric fields
are returned.

>>> df = pd.DataFrame({ 'object': ['a', 'b', 'c'],
>>> df = pd.DataFrame({ 'categorical': pd.Categorical(['d','e','f']),
Contributor

PEP8 formatting here. No space after the {, spaces after the , in the Categorical.
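
For illustration, a minimal sketch of the construction with the requested PEP8 spacing applied. The column order follows the new docstring lines in this diff; the `summary` variable name is an assumption added here and is not part of the PR.

```python
import pandas as pd

# No space after the opening brace, and a space after each comma
# inside the Categorical, as the review comment asks.
df = pd.DataFrame({'categorical': pd.Categorical(['d', 'e', 'f']),
                   'numeric': [1, 2, 3],
                   'object': ['a', 'b', 'c']})

# By default describe() summarizes only the numeric columns.
summary = df.describe()
print(summary)
```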

... 'numeric': [1, 2, 3],
... 'categorical': pd.Categorical(['d','e','f'])
... 'object': ['a', 'b', 'c']
... })
>>> df.describe()
numeric
@@ -7393,7 +7405,7 @@ def describe(self, percentiles=None, include=None, exclude=None):
Excluding object columns from a ``DataFrame`` description.

>>> df.describe(exclude=[np.object])
categorical numeric
categorical numeric
Contributor

When running the validation script, I occasionally get a failure

Line 210, in pandas.DataFrame.describe
Failed example:
    df.describe(exclude=[np.number])
Expected:
           categorical object
    count            3      3
    unique           3      3
    top              f      c
    freq             1      1
Got:
           categorical object
    count            3      3
    unique           3      3
    top              f      a
    freq             1      1

Did you see this at all? This likely is an issue in the method itself, and not the docstring.

Contributor Author

Yeah, I do see this error, but it's flaky.

Contributor

To be clear, it's probably some kind of non-stable sorting inside the describe method, and nothing wrong with the docstring. It may be best to just include the docstring, and open a new issue.
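
One way to see why `top` can differ between runs, as a sketch only: in the docstring example every value in the column occurs exactly once, so all of them tie for the highest frequency and any of them is a valid `top`. The snippet below does not identify the cause inside `describe`; it only shows the tie. The names `s`, `counts`, `tied` and `top` are illustrative.

```python
import pandas as pd

s = pd.Series(['a', 'b', 'c'])

# Every value occurs once, so all three tie for the highest frequency.
counts = s.value_counts()
tied = set(counts[counts == counts.max()].index)

# Whichever value describe() reports as 'top', it is one of the tied values.
top = s.describe()['top']
print(sorted(tied), top)
```

A doctest that hard-codes one particular `top` for tied data therefore depends on how the tie happens to be broken.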

Contributor

The strange thing is that just doing

pd.DataFrame({"A": pd.Categorical(['d', 'e', 'f']), "B": ['a', 'b', 'c'], 'C': [1, 2, 3]}).describe(exclude=['number'])

seems deterministic.
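
A hedged sketch of how this observation could be checked: rebuild the frame from the one-liner several times and compare the results. The helper name `describe_once` and the repeat count of 20 are assumptions made for this example, and a stable result here would not rule out flakiness under the validation script.

```python
import pandas as pd


def describe_once():
    # Same frame as in the comment above, rebuilt on every call.
    df = pd.DataFrame({"A": pd.Categorical(['d', 'e', 'f']),
                       "B": ['a', 'b', 'c'],
                       'C': [1, 2, 3]})
    return df.describe(exclude=['number'])


results = [describe_once() for _ in range(20)]

# True only if every run produced exactly the same summary as the first.
stable = all(r.equals(results[0]) for r in results)
print(stable)
```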

count 3 3.0
unique 3 NaN
top f NaN
@@ -7405,15 +7417,6 @@ def describe(self, percentiles=None, include=None, exclude=None):
50% NaN 2.0
75% NaN 2.5
max NaN 3.0

See Also
--------
DataFrame.count
DataFrame.max
DataFrame.min
DataFrame.mean
DataFrame.std
DataFrame.select_dtypes
"""
if self.ndim >= 3:
msg = "describe is not implemented on Panel objects."