rhshadrach
Member

There are two cases in which we want to emit a deprecation warning for DataFrameGroupBy:

  • numeric_only is not specified and columns get dropped. In this case, emit a warning that the default of numeric_only will change to False in the future.
  • numeric_only is explicitly set to False and columns still get dropped. In this case, emit a warning that the op will raise in the future.
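
The two cases above can be sketched as a small standalone function. This is only an illustration of the decision logic, not the helper that was added in this PR: the name `warn_dropped_columns`, the `NOT_SPECIFIED` sentinel, and the wording of the second message are assumptions made for the example.

```python
import warnings

# Hypothetical sentinel standing in for "the caller did not pass numeric_only".
NOT_SPECIFIED = object()


def warn_dropped_columns(how, numeric_only, input_columns, result_columns):
    """Emit a FutureWarning if columns were dropped (illustrative sketch only)."""
    dropped = [col for col in input_columns if col not in result_columns]
    if not dropped:
        return  # nothing was dropped, so no deprecation applies

    if numeric_only is NOT_SPECIFIED:
        # Case 1: the default was relied on; the default will change to False.
        warnings.warn(
            f"The default value of numeric_only in DataFrameGroupBy.{how} is "
            "deprecated. In a future version, numeric_only will default to False.",
            FutureWarning,
            stacklevel=2,
        )
    elif numeric_only is False:
        # Case 2: False was requested but columns were dropped anyway;
        # the message text here is made up for the example.
        warnings.warn(
            f"Dropping columns {dropped} in DataFrameGroupBy.{how} is deprecated. "
            "In a future version, this operation will raise.",
            FutureWarning,
            stacklevel=2,
        )


# Record warnings instead of printing them so the outcome can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_dropped_columns("sum", NOT_SPECIFIED, ["a", "b"], ["a"])  # case 1
    warn_dropped_columns("sum", False, ["a", "b"], ["a"])          # case 2
    warn_dropped_columns("sum", True, ["a", "b"], ["a"])           # no warning
    warn_dropped_columns("sum", False, ["a", "b"], ["a", "b"])     # nothing dropped
```

Passing `numeric_only=True` never warns in this sketch, since dropping non-numeric columns is then the requested behavior.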

@rhshadrach rhshadrach added Groupby, Deprecate (Functionality to remove in pandas), Nuisance Columns (Identifying/Dropping nuisance columns in reductions, groupby.add, DataFrame.apply) labels May 14, 2022
@rhshadrach rhshadrach added this to the 1.5 milestone May 14, 2022
Contributor

@jreback jreback left a comment

not a fan of filterwarnings, i know it's a bit annoying but can you either explicitly test or just pass numeric_only=False?

libgroupby.group_var,
cython_dtype=np.dtype(np.float64),
numeric_only=numeric_only,
needs_counts=True,
post_processing=lambda vals, inference: np.sqrt(vals),
ddof=ddof,
)
if (
self.axis != 1

can you make a helper for this rather than repeating?

@@ -81,6 +81,7 @@ def get_stats(group):
assert result.index.names[0] == "C"


@pytest.mark.filterwarnings("ignore:.*value of numeric_only.*:FutureWarning")

can you explicitly test these rather than filtering (alt pass numeric_only=False) as needed
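
A minimal sketch of what testing the warning explicitly, rather than filtering it, could look like. It uses only the standard library so that it runs on its own; `grouped_sum_stub` and `assert_warns_future` are hypothetical stand-ins, not functions from pandas or from this PR. In a pytest suite the same check is usually written with `pytest.warns(FutureWarning, match=...)`.

```python
import warnings


def grouped_sum_stub():
    """Hypothetical stand-in for a groupby call relying on the deprecated default."""
    warnings.warn(
        "The default value of numeric_only in DataFrameGroupBy.sum is deprecated.",
        FutureWarning,
        stacklevel=2,
    )
    return "result"


def assert_warns_future(func, match):
    """Call func and fail unless it emits a FutureWarning containing match."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = func()
    messages = [
        str(w.message) for w in caught if issubclass(w.category, FutureWarning)
    ]
    assert any(match in msg for msg in messages), (
        f"expected a FutureWarning containing {match!r}, got {messages}"
    )
    return result


# The test asserts the warning instead of hiding it with filterwarnings.
result = assert_warns_future(grouped_sum_stub, "value of numeric_only")
```

Unlike a blanket `filterwarnings("ignore:...")` mark, this fails if the warning stops being emitted, so the deprecation itself stays covered by the test.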

@rhshadrach
Member Author

Thanks @jreback; filterwarnings has been removed and the helper has been added.

@jreback jreback merged commit 7c054d6 into pandas-dev:main May 18, 2022
@jreback
Contributor

jreback commented May 18, 2022

very nice @rhshadrach

@rhshadrach rhshadrach deleted the depr_groupby_numeric_only branch May 18, 2022 12:57
@twoertwein
Member

I think this causes the doc build to fail, with multiple instances of this warning:

:1: FutureWarning: The default value of numeric_only in DataFrameGroupBy.sum is deprecated. In a future version, numeric_only will default to False. Either specify numeric_only or select only columns which should be valid for the function.
(bb.groupby(['year', 'team']).sum()

Would probably need to update the documentation to avoid these FutureWarnings.
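
As a sketch of the two options the warning message itself suggests, documentation examples could either pass numeric_only explicitly or select the numeric columns before aggregating. The small `bb` frame below is made-up data standing in for the dataset used in the docs.

```python
import pandas as pd

# Made-up stand-in for the `bb` frame from the docs; "player" is non-numeric.
bb = pd.DataFrame(
    {
        "year": [2006, 2006, 2007],
        "team": ["A", "A", "B"],
        "player": ["x", "y", "z"],
        "r": [1, 2, 3],
    }
)

# Option 1: state numeric_only explicitly instead of relying on the default.
explicit = bb.groupby(["year", "team"]).sum(numeric_only=True)

# Option 2: select only the columns that are valid for the function.
selected = bb.groupby(["year", "team"])[["r"]].sum()
```

Both calls aggregate only the `r` column, so neither depends on the default value of numeric_only.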

@rhshadrach
Member Author

Thanks @twoertwein - will do a follow up.

mroeschke added a commit that referenced this pull request May 25, 2022
* TYP: NoDefault

* ix mypy issues; re-write isinstance(..., NoDefault)

* remove two more casts

* ENH: DatetimeArray fields support non-nano (#47044)

* DEPR: groupby numeric_only default (#47025)

* DOC: Clarify decay argument validation in ewm when times is provided (#47026)

* DOC: Fix some typos in pandas/. (#47022)

* remove two more casts

* avoid cast-like annotation

* left/right

* cannot use |

Co-authored-by: jbrockmendel <[email protected]>
Co-authored-by: Richard Shadrach <[email protected]>
Co-authored-by: Matthew Roeschke <[email protected]>
Co-authored-by: Shuangchi He <[email protected]>
yehoshuadimarsky pushed a commit to yehoshuadimarsky/pandas that referenced this pull request Jul 13, 2022

Successfully merging this pull request may close these issues.

DEPR: DataFrameGroupBy numeric_only defaulting to True