df.groupby(group_keys=True) sometimes doesn't do anything #26805

@ghost

Description

According to the df.groupby docstring:

group_keys : bool, default True
    When calling apply, add group keys to index to identify pieces.

But it seems to work for some cases and not others:

import numpy as np
import pandas as pd

df = pd.DataFrame({'key': [1, 1, 1, 2, 2, 2, 3, 3, 3],
                   'value': range(9)})

df.groupby('key', group_keys=True).apply(lambda x: x.key)  # index by groups
df.groupby('key', group_keys=True).apply(np.sum)  # index by groups
df.groupby('key', group_keys=True).apply(lambda x: x[:].key)  # index by groups
df.groupby('key', group_keys=True).apply(lambda x: x - x.mean())  # group keys not added
df.groupby('key', group_keys=True).apply(lambda x: x)  # group keys not added

For example, the following gives the same output regardless of the group_keys value:

import pandas as pd

df = pd.DataFrame(dict(price=[10, 10, 20, 20, 30, 30],
                       color=[10, 10, 20, 20, 30, 30],
                       cost=(100, 200, 300, 400, 500, 600)))
df.groupby(['price'], group_keys=False).apply(lambda x: x)
# result
   price  color  cost
0     10     10   100
1     10     10   200
2     20     20   300
3     20     20   400
4     30     30   500
5     30     30   600

df.groupby(['price'], group_keys=True).apply(lambda x: x)
# same result
   price  color  cost
0     10     10   100
1     10     10   200
2     20     20   300
3     20     20   400
4     30     30   500
5     30     30   600

xref #22545 for related groupby confusion.
