DOC: update NDFrame.squeeze docstring #20269

Merged
merged 3 commits into from
Jul 7, 2018
98 changes: 94 additions & 4 deletions pandas/core/generic.py
@@ -699,18 +699,108 @@ def pop(self, item):

def squeeze(self, axis=None):
"""
Squeeze length 1 dimensions.
Squeeze 1 dimensional axis objects into scalars.

Series or DataFrames with a single element are squeezed to a scalar.
DataFrames with a single column or a single row are squeezed to a
Series. Otherwise the object is unchanged.

This method is most useful when you don't know if your
object is a Series or DataFrame, but you do know it has just a single
column. In that case you can safely call `squeeze` to ensure you have a
Series.

Parameters
----------
axis : None, integer or string axis name, optional
The axis to squeeze if 1-sized.
axis : {0 or 'index', 1 or 'columns', None}, default None
A specific axis to squeeze. By default, all length-1 axes are
squeezed.

.. versionadded:: 0.20.0

Returns
-------
scalar if 1-sized, else original object
DataFrame, Series, or scalar
The projection after squeezing `axis` or all the axes.

See Also
--------
Series.iloc : Integer-location based indexing for selecting scalars
DataFrame.iloc : Integer-location based indexing for selecting Series
Member

Does it make sense to add DataFrame.to_series? I think they do the same in the case of 1 column DataFrame, right?

Contributor Author

It does, somewhat. I think squeezing is most useful in slicing scenarios, but perhaps someone might find that a direct conversion is what they really wanted. Will add it too.

Member

I don't think DataFrame.to_series exists?

Member

hehe, that's a good reason to not add it... ;) not sure what I was thinking about, I think I got confused with Index.to_series, sorry
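
For readers following this thread, here is a minimal sketch of the relationships being discussed. It assumes a reasonably recent pandas, and the variable names are illustrative only, not taken from the PR:

```python
import pandas as pd

# A single-column DataFrame (illustrative data).
df = pd.DataFrame({"a": [1, 3]})

# Squeezing the columns axis of a single-column DataFrame yields a Series
# named after that column.
s = df.squeeze("columns")

# Series.to_frame() goes back the other way.
round_trip = s.to_frame()

# Index.to_series exists, which is likely the method the reviewer had in mind.
idx_series = pd.Index(["x", "y"]).to_series()

# DataFrame itself defines no to_series method.
has_df_to_series = hasattr(pd.DataFrame, "to_series")
```

This is why `Series.to_frame` ends up in the See Also section as the inverse operation rather than a `DataFrame.to_series` entry.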

Series.to_frame : Inverse of DataFrame.squeeze for a
single-column DataFrame.

Examples
--------
>>> primes = pd.Series([2, 3, 5, 7])

Slicing might produce a Series with a single value:

>>> even_primes = primes[primes % 2 == 0]
>>> even_primes
0 2
dtype: int64

>>> even_primes.squeeze()
2

Squeezing objects with more than one value in every axis does nothing:

>>> odd_primes = primes[primes % 2 == 1]
>>> odd_primes
1 3
2 5
3 7
dtype: int64

>>> odd_primes.squeeze()
1 3
2 5
3 7
dtype: int64

Squeezing is even more effective when used with DataFrames.

>>> df = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
>>> df
a b
0 1 2
1 3 4

Slicing a single column will produce a DataFrame whose columns axis
has only one element:

>>> df_a = df[['a']]
>>> df_a
a
0 1
1 3

So the columns can be squeezed down, resulting in a Series:

>>> df_a.squeeze('columns')
0 1
1 3
Name: a, dtype: int64

Slicing a single row from a single column will produce a single
scalar DataFrame:

>>> df_0a = df.loc[df.index < 1, ['a']]
>>> df_0a
a
0 1

Squeezing the rows produces a single scalar Series:

>>> df_0a.squeeze('rows')
a 1
Name: 0, dtype: int64

Squeezing all axes will project directly into a scalar:

>>> df_0a.squeeze()
1
Member

I feel like this example using df covers what's shown first with the primes. I'd leave just this one; personally I find it really good, and it's enough that the previous one doesn't need to be listed.

Contributor Author

@villasv villasv Mar 14, 2018

I wanted to have examples with both Series and DataFrames because both classes share this docstring, so it would be a bit weird to read the docs from Series.squeeze and find examples only of DataFrame.squeeze. But I think I could "chain" those examples, since in the middle of the df example I may squeeze some Series as well.

I'll try to merge both, because the Series example is more concrete and related to slicing (the most likely use case IMO), but the second covers both classes.
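
One possible shape for such a "chained" example is sketched below. This is only an illustration of the idea in the comment above, not the wording that was eventually merged; the names `col_a`, `first`, and `unchanged` are hypothetical:

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], columns=["a", "b"])

# DataFrame.squeeze: a single-column DataFrame becomes a Series.
col_a = df[["a"]].squeeze("columns")

# Series.squeeze: slicing down to one element, then squeezing, gives a scalar.
first = col_a[col_a < 2].squeeze()

# A Series with more than one element is returned unchanged.
unchanged = col_a.squeeze()
```

Chaining this way exercises both `DataFrame.squeeze` and `Series.squeeze` from one starting object, which fits a docstring shared by both classes.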

"""
axis = (self._AXIS_NAMES if axis is None else
(self._get_axis_number(axis),))