Commit 6b4254e

DEPR: Replacing builtin and NumPy funcs in agg/apply/transform (#53974)
* DEPR: Replacing builtin and NumPy funcs in agg/apply/transform
* mypy fixup
1 parent 431dd6f commit 6b4254e

44 files changed, +510 -272 lines changed

doc/source/getting_started/comparison/comparison_with_r.rst

Lines changed: 3 additions & 3 deletions
@@ -246,7 +246,7 @@ In pandas we may use :meth:`~pandas.pivot_table` method to handle this:
 }
 )
 
-baseball.pivot_table(values="batting avg", columns="team", aggfunc=np.max)
+baseball.pivot_table(values="batting avg", columns="team", aggfunc="max")
 
 For more details and examples see :ref:`the reshaping documentation
 <reshaping.pivot>`.
@@ -359,7 +359,7 @@ In pandas the equivalent expression, using the
 )
 
 grouped = df.groupby(["month", "week"])
-grouped["x"].agg([np.mean, np.std])
+grouped["x"].agg(["mean", "std"])
 
 
 For more details and examples see :ref:`the groupby documentation
@@ -482,7 +482,7 @@ In Python the best way is to make use of :meth:`~pandas.pivot_table`:
 values="value",
 index=["variable", "week"],
 columns=["month"],
-aggfunc=np.mean,
+aggfunc="mean",
 )
 
 Similarly for ``dcast`` which uses a data.frame called ``df`` in R to
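
Note on the hunks above: each one replaces a NumPy callable passed as ``aggfunc`` with its string alias. A minimal sketch of the new spelling, assuming a small made-up ``baseball`` frame (the values are illustrative and are not the data used in the pandas docs):

```python
import pandas as pd

# Hypothetical data; only the column names mirror the docs example.
baseball = pd.DataFrame(
    {
        "team": ["A", "A", "B", "B"],
        "batting avg": [0.25, 0.31, 0.28, 0.22],
    }
)

# String alias instead of aggfunc=np.max.
table = baseball.pivot_table(values="batting avg", columns="team", aggfunc="max")
print(table)
```

The result has one row labelled ``batting avg`` and one column per team.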

doc/source/getting_started/comparison/comparison_with_sql.rst

Lines changed: 2 additions & 2 deletions
@@ -198,7 +198,7 @@ to your grouped DataFrame, indicating which functions to apply to specific colum
 
 .. ipython:: python
 
-tips.groupby("day").agg({"tip": np.mean, "day": np.size})
+tips.groupby("day").agg({"tip": "mean", "day": "size"})
 
 Grouping by more than one column is done by passing a list of columns to the
 :meth:`~pandas.DataFrame.groupby` method.
@@ -222,7 +222,7 @@ Grouping by more than one column is done by passing a list of columns to the
 
 .. ipython:: python
 
-tips.groupby(["smoker", "day"]).agg({"tip": [np.size, np.mean]})
+tips.groupby(["smoker", "day"]).agg({"tip": ["size", "mean"]})
 
 .. _compare_with_sql.join:
 
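
Note on the hunks above: the dict passed to ``agg`` now maps column names to string aliases, or to lists of them. A minimal sketch of the multi-column grouping case, assuming a tiny stand-in for the ``tips`` dataset (the rows below are invented for illustration):

```python
import pandas as pd

# Hypothetical stand-in for the tips dataset used in the docs.
tips = pd.DataFrame(
    {
        "smoker": ["No", "No", "Yes", "Yes", "Yes"],
        "day": ["Fri", "Fri", "Fri", "Sat", "Sat"],
        "tip": [1.0, 3.0, 2.0, 4.0, 6.0],
    }
)

# A list of string aliases yields one output column per alias,
# nested under the source column in a MultiIndex.
summary = tips.groupby(["smoker", "day"]).agg({"tip": ["size", "mean"]})
print(summary)
```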

doc/source/user_guide/basics.rst

Lines changed: 3 additions & 3 deletions
@@ -881,8 +881,8 @@ statistics methods, takes an optional ``axis`` argument:
 
 .. ipython:: python
 
-df.apply(np.mean)
-df.apply(np.mean, axis=1)
+df.apply(lambda x: np.mean(x))
+df.apply(lambda x: np.mean(x), axis=1)
 df.apply(lambda x: x.max() - x.min())
 df.apply(np.cumsum)
 df.apply(np.exp)
@@ -986,7 +986,7 @@ output:
 
 .. ipython:: python
 
-tsdf.agg(np.sum)
+tsdf.agg(lambda x: np.sum(x))
 
 tsdf.agg("sum")
 
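
Note on the hunks above: where the docs want to keep an explicit NumPy call, the callable is wrapped in a lambda instead of being passed bare; the string alias form stays alongside it. A minimal sketch of both spellings, assuming a small made-up frame:

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 9.0]})

# Explicit NumPy call wrapped in a lambda, per column and per row.
col_means = df.apply(lambda x: np.mean(x))
row_means = df.apply(lambda x: np.mean(x), axis=1)

# String alias form of an aggregation.
totals = df.agg("sum")
print(col_means, row_means, totals, sep="\n")
```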

doc/source/user_guide/cookbook.rst

Lines changed: 3 additions & 3 deletions
@@ -530,7 +530,7 @@ Unlike agg, apply's callable is passed a sub-DataFrame which gives you access to
 
 code_groups = df.groupby("code")
 
-agg_n_sort_order = code_groups[["data"]].transform(sum).sort_values(by="data")
+agg_n_sort_order = code_groups[["data"]].transform("sum").sort_values(by="data")
 
 sorted_df = df.loc[agg_n_sort_order.index]
 
@@ -549,7 +549,7 @@ Unlike agg, apply's callable is passed a sub-DataFrame which gives you access to
 return x.iloc[1] * 1.234
 return pd.NaT
 
-mhc = {"Mean": np.mean, "Max": np.max, "Custom": MyCust}
+mhc = {"Mean": "mean", "Max": "max", "Custom": MyCust}
 ts.resample("5min").apply(mhc)
 ts
 
@@ -685,7 +685,7 @@ The :ref:`Pivot <reshaping.pivot>` docs.
 values=["Sales"],
 index=["Province"],
 columns=["City"],
-aggfunc=np.sum,
+aggfunc="sum",
 margins=True,
 )
 table.stack("City")
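
Note on the first hunk above: the builtin ``sum`` passed to ``transform`` becomes the string ``"sum"``. A minimal sketch of that sort-by-group-total recipe, assuming a made-up frame with the same ``code`` and ``data`` column names:

```python
import pandas as pd

# Hypothetical data; group "a" totals 4 and group "b" totals 6.
df = pd.DataFrame({"code": ["a", "b", "a", "b"], "data": [1, 2, 3, 4]})

code_groups = df.groupby("code")

# transform("sum") broadcasts each group's total back onto its rows.
agg_n_sort_order = code_groups[["data"]].transform("sum").sort_values(by="data")

# Reorder the original rows by their group's total.
sorted_df = df.loc[agg_n_sort_order.index]
print(sorted_df)
```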

doc/source/user_guide/reshaping.rst

Lines changed: 5 additions & 5 deletions
@@ -402,12 +402,12 @@ We can produce pivot tables from this data very easily:
 .. ipython:: python
 
 pd.pivot_table(df, values="D", index=["A", "B"], columns=["C"])
-pd.pivot_table(df, values="D", index=["B"], columns=["A", "C"], aggfunc=np.sum)
+pd.pivot_table(df, values="D", index=["B"], columns=["A", "C"], aggfunc="sum")
 pd.pivot_table(
 df, values=["D", "E"],
 index=["B"],
 columns=["A", "C"],
-aggfunc=np.sum,
+aggfunc="sum",
 )
 
 The result object is a :class:`DataFrame` having potentially hierarchical indexes on the
@@ -451,7 +451,7 @@ rows and columns:
 columns="C",
 values=["D", "E"],
 margins=True,
-aggfunc=np.std
+aggfunc="std"
 )
 table
 
@@ -552,7 +552,7 @@ each group defined by the first two :class:`Series`:
 
 .. ipython:: python
 
-pd.crosstab(df["A"], df["B"], values=df["C"], aggfunc=np.sum)
+pd.crosstab(df["A"], df["B"], values=df["C"], aggfunc="sum")
 
 Adding margins
 ~~~~~~~~~~~~~~
@@ -562,7 +562,7 @@ Finally, one can also add margins or normalize this output.
 .. ipython:: python
 
 pd.crosstab(
-df["A"], df["B"], values=df["C"], aggfunc=np.sum, normalize=True, margins=True
+df["A"], df["B"], values=df["C"], aggfunc="sum", normalize=True, margins=True
 )
 
 .. _reshaping.tile:
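
Note on the ``crosstab`` hunks above: ``aggfunc`` takes the string alias in the same way as ``pivot_table``. A minimal sketch, assuming a small made-up frame with columns ``A``, ``B`` and ``C`` (the values are illustrative):

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame(
    {
        "A": [1, 2, 2, 2, 2],
        "B": [3, 3, 4, 4, 4],
        "C": [1, 1, 10, 100, 1000],
    }
)

# Sum the values of C within each (A, B) cell.
table = pd.crosstab(df["A"], df["B"], values=df["C"], aggfunc="sum")
print(table)
```

Cells with no matching rows, here ``A == 1`` with ``B == 4``, come back as missing values.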

doc/source/user_guide/timeseries.rst

Lines changed: 3 additions & 3 deletions
@@ -1801,22 +1801,22 @@ You can pass a list or dict of functions to do aggregation with, outputting a ``
 
 .. ipython:: python
 
-r["A"].agg([np.sum, np.mean, np.std])
+r["A"].agg(["sum", "mean", "std"])
 
 On a resampled ``DataFrame``, you can pass a list of functions to apply to each
 column, which produces an aggregated result with a hierarchical index:
 
 .. ipython:: python
 
-r.agg([np.sum, np.mean])
+r.agg(["sum", "mean"])
 
 By passing a dict to ``aggregate`` you can apply a different aggregation to the
 columns of a ``DataFrame``:
 
 .. ipython:: python
 :okexcept:
 
-r.agg({"A": np.sum, "B": lambda x: np.std(x, ddof=1)})
+r.agg({"A": "sum", "B": lambda x: np.std(x, ddof=1)})
 
 The function names can also be strings. In order for a string to be valid it
 must be implemented on the resampled object:
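
Note on the hunk above: the list and dict forms of ``agg`` on a resampler now use string aliases. A minimal sketch, assuming a made-up minute-frequency frame and a 3-minute resample (neither is the data from the pandas docs, and the dict here uses only string aliases rather than the lambda shown in the hunk):

```python
import pandas as pd

# Hypothetical data: six one-minute observations.
idx = pd.date_range("2012-01-01", periods=6, freq="min")
df = pd.DataFrame(
    {"A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0], "B": [2.0, 4.0, 6.0, 1.0, 1.0, 1.0]},
    index=idx,
)
r = df.resample("3min")

# List of aliases on one column: one output column per alias.
a_stats = r["A"].agg(["sum", "mean", "std"])

# Dict of aliases: a different aggregation per column.
by_col = r.agg({"A": "sum", "B": "std"})
print(a_stats, by_col, sep="\n")
```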

doc/source/user_guide/window.rst

Lines changed: 1 addition & 1 deletion
@@ -140,7 +140,7 @@ of multiple aggregations applied to a window.
 .. ipython:: python
 
 df = pd.DataFrame({"A": range(5), "B": range(10, 15)})
-df.expanding().agg([np.sum, np.mean, np.std])
+df.expanding().agg(["sum", "mean", "std"])
 
 
 .. _window.generic:
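
Note on the hunk above: the same frame from the docs example can be used to sketch what the string-alias call returns. The variable name ``out`` is introduced here for illustration only:

```python
import pandas as pd

df = pd.DataFrame({"A": range(5), "B": range(10, 15)})

# One output column per (source column, alias) pair.
out = df.expanding().agg(["sum", "mean", "std"])
print(out)
```

The first row of each ``std`` column is missing, since a single observation has no sample standard deviation.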

doc/source/whatsnew/v0.14.0.rst

Lines changed: 1 addition & 1 deletion
@@ -846,7 +846,7 @@ Enhancements
 df.pivot_table(values='Quantity',
 index=pd.Grouper(freq='M', key='Date'),
 columns=pd.Grouper(freq='M', key='PayDay'),
-aggfunc=np.sum)
+aggfunc="sum")
 
 - Arrays of strings can be wrapped to a specified width (``str.wrap``) (:issue:`6999`)
 - Add :meth:`~Series.nsmallest` and :meth:`Series.nlargest` methods to Series, See :ref:`the docs <basics.nsorted>` (:issue:`3960`)

doc/source/whatsnew/v0.20.0.rst

Lines changed: 4 additions & 4 deletions
@@ -984,7 +984,7 @@ Previous behavior:
 75% 3.750000
 max 4.000000
 
-In [3]: df.groupby('A').agg([np.mean, np.std, np.min, np.max])
+In [3]: df.groupby('A').agg(["mean", "std", "min", "max"])
 Out[3]:
 B
 mean std amin amax
@@ -1000,7 +1000,7 @@ New behavior:
 
 df.groupby('A').describe()
 
-df.groupby('A').agg([np.mean, np.std, np.min, np.max])
+df.groupby('A').agg(["mean", "std", "min", "max"])
 
 .. _whatsnew_0200.api_breaking.rolling_pairwise:
 
@@ -1163,7 +1163,7 @@ Previous behavior:
 
 .. code-block:: ipython
 
-In [2]: df.pivot_table('col1', index=['col3', 'col2'], aggfunc=np.sum)
+In [2]: df.pivot_table('col1', index=['col3', 'col2'], aggfunc="sum")
 Out[2]:
 col3 col2
 1 C 3
@@ -1175,7 +1175,7 @@ New behavior:
 
 .. ipython:: python
 
-df.pivot_table('col1', index=['col3', 'col2'], aggfunc=np.sum)
+df.pivot_table('col1', index=['col3', 'col2'], aggfunc="sum")
 
 .. _whatsnew_0200.api:
 

doc/source/whatsnew/v0.25.0.rst

Lines changed: 2 additions & 2 deletions
@@ -48,7 +48,7 @@ output columns when applying multiple aggregation functions to specific columns
 animals.groupby("kind").agg(
 min_height=pd.NamedAgg(column='height', aggfunc='min'),
 max_height=pd.NamedAgg(column='height', aggfunc='max'),
-average_weight=pd.NamedAgg(column='weight', aggfunc=np.mean),
+average_weight=pd.NamedAgg(column='weight', aggfunc="mean"),
 )
 
 Pass the desired columns names as the ``**kwargs`` to ``.agg``. The values of ``**kwargs``
@@ -61,7 +61,7 @@ what the arguments to the function are, but plain tuples are accepted as well.
 animals.groupby("kind").agg(
 min_height=('height', 'min'),
 max_height=('height', 'max'),
-average_weight=('weight', 'mean'),
+average_weight=('weight', 'mean'),
 )
 
 Named aggregation is the recommended replacement for the deprecated "dict-of-dicts"
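
Note on the hunks above: with this change every named aggregation in the example uses a string alias. A minimal sketch of the tuple form, assuming a small made-up ``animals`` frame (the rows are illustrative):

```python
import pandas as pd

# Hypothetical data; column names follow the docs example.
animals = pd.DataFrame(
    {
        "kind": ["cat", "dog", "cat", "dog"],
        "height": [9.1, 6.0, 9.5, 34.0],
        "weight": [7.9, 7.5, 9.9, 198.0],
    }
)

# Each keyword names an output column; each value is (source column, alias).
out = animals.groupby("kind").agg(
    min_height=("height", "min"),
    max_height=("height", "max"),
    average_weight=("weight", "mean"),
)
print(out)
```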
