Commit 384818f

Add docs
1 parent f3c50d1 commit 384818f

6 files changed, +193 -46 lines changed

doc/api-hidden.rst

Lines changed: 4 additions & 0 deletions
@@ -693,3 +693,7 @@
 
 coding.times.CFTimedeltaCoder
 coding.times.CFDatetimeCoder
+
+core.groupers.Grouper
+core.groupers.Resampler
+core.groupers.EncodedGroups

doc/api.rst

Lines changed: 19 additions & 4 deletions
@@ -801,6 +801,18 @@ DataArray
 DataArrayGroupBy.dims
 DataArrayGroupBy.groups
 
+Grouper Objects
+---------------
+
+.. currentmodule:: xarray.core
+
+.. autosummary::
+:toctree: generated/
+
+groupers.BinGrouper
+groupers.UniqueGrouper
+groupers.TimeResampler
+
 
 Rolling objects
 ===============
@@ -1026,17 +1038,20 @@ DataArray
 Accessors
 =========
 
-.. currentmodule:: xarray
+.. currentmodule:: xarray.core
 
 .. autosummary::
 :toctree: generated/
 
-core.accessor_dt.DatetimeAccessor
-core.accessor_dt.TimedeltaAccessor
-core.accessor_str.StringAccessor
+accessor_dt.DatetimeAccessor
+accessor_dt.TimedeltaAccessor
+accessor_str.StringAccessor
+
 
 Custom Indexes
 ==============
+.. currentmodule:: xarray
+
 .. autosummary::
 :toctree: generated/

doc/conf.py

Lines changed: 1 addition & 0 deletions
@@ -166,6 +166,7 @@
 "CategoricalIndex": "~pandas.CategoricalIndex",
 "TimedeltaIndex": "~pandas.TimedeltaIndex",
 "DatetimeIndex": "~pandas.DatetimeIndex",
+"IntervalIndex": "~pandas.IntervalIndex",
 "Series": "~pandas.Series",
 "DataFrame": "~pandas.DataFrame",
 "Categorical": "~pandas.Categorical",

doc/user-guide/groupby.rst

Lines changed: 76 additions & 11 deletions
@@ -1,3 +1,5 @@
+.. currentmodule:: xarray
+
 .. _groupby:
 
 GroupBy: Group and Bin Data
@@ -15,19 +17,20 @@ __ https://www.jstatsoft.org/v40/i01/paper
 - Apply some function to each group.
 - Combine your groups back into a single data object.
 
-Group by operations work on both :py:class:`~xarray.Dataset` and
-:py:class:`~xarray.DataArray` objects. Most of the examples focus on grouping by
+Group by operations work on both :py:class:`Dataset` and
+:py:class:`DataArray` objects. Most of the examples focus on grouping by
 a single one-dimensional variable, although support for grouping
 over a multi-dimensional variable has recently been implemented. Note that for
 one-dimensional data, it is usually faster to rely on pandas' implementation of
 the same pipeline.
 
 .. tip::
 
-To substantially improve the performance of GroupBy operations, particularly
-with dask `install the flox package <https://flox.readthedocs.io>`_. flox
+`Install the flox package <https://flox.readthedocs.io>`_ to substantially improve the performance
+of GroupBy operations, particularly with dask. flox
 `extends Xarray's in-built GroupBy capabilities <https://flox.readthedocs.io/en/latest/xarray.html>`_
-by allowing grouping by multiple variables, and lazy grouping by dask arrays. If installed, Xarray will automatically use flox by default.
+by allowing grouping by multiple variables, and lazy grouping by dask arrays.
+If installed, Xarray will automatically use flox by default.
 
 Split
 ~~~~~
@@ -87,7 +90,7 @@ Binning
 Sometimes you don't want to use all the unique values to determine the groups
 but instead want to "bin" the data into coarser groups. You could always create
 a customized coordinate, but xarray facilitates this via the
-:py:meth:`~xarray.Dataset.groupby_bins` method.
+:py:meth:`Dataset.groupby_bins` method.
 
 .. ipython:: python
 
@@ -110,7 +113,7 @@ Apply
 ~~~~~
 
 To apply a function to each group, you can use the flexible
-:py:meth:`~xarray.core.groupby.DatasetGroupBy.map` method. The resulting objects are automatically
+:py:meth:`core.groupby.DatasetGroupBy.map` method. The resulting objects are automatically
 concatenated back together along the group axis:
 
 .. ipython:: python
@@ -121,8 +124,8 @@ concatenated back together along the group axis:
 
 arr.groupby("letters").map(standardize)
 
-GroupBy objects also have a :py:meth:`~xarray.core.groupby.DatasetGroupBy.reduce` method and
-methods like :py:meth:`~xarray.core.groupby.DatasetGroupBy.mean` as shortcuts for applying an
+GroupBy objects also have a :py:meth:`core.groupby.DatasetGroupBy.reduce` method and
+methods like :py:meth:`core.groupby.DatasetGroupBy.mean` as shortcuts for applying an
 aggregation function:
 
 .. ipython:: python
@@ -183,7 +186,7 @@ Iterating and Squeezing
 Previously, Xarray defaulted to squeezing out dimensions of size one when iterating over
 a GroupBy object. This behaviour is being removed.
 You can always squeeze explicitly later with the Dataset or DataArray
-:py:meth:`~xarray.DataArray.squeeze` methods.
+:py:meth:`DataArray.squeeze` methods.
 
 .. ipython:: python
 
@@ -217,7 +220,7 @@ __ https://cfconventions.org/cf-conventions/v1.6.0/cf-conventions.html#_two_dime
 da.groupby("lon").map(lambda x: x - x.mean(), shortcut=False)
 
 Because multidimensional groups have the ability to generate a very large
-number of bins, coarse-binning via :py:meth:`~xarray.Dataset.groupby_bins`
+number of bins, coarse-binning via :py:meth:`Dataset.groupby_bins`
 may be desirable:
 
 .. ipython:: python
@@ -232,3 +235,65 @@ applying your function, and then unstacking the result:
 
 stacked = da.stack(gridcell=["ny", "nx"])
 stacked.groupby("gridcell").sum(...).unstack("gridcell")
+
+.. _groupby.groupers:
+
+Grouper Objects
+~~~~~~~~~~~~~~~
+
+Both ``groupby_bins`` and ``resample`` are specializations of the core ``groupby`` operation for binning
+and time resampling. Many problems demand more complex GroupBy application: for example, grouping by multiple
+variables with a combination of categorical grouping, binning, and resampling; or more specializations like
+spatial resampling; or more complex time grouping like special handling of seasons, or the ability to specify
+custom seasons. To handle these use-cases and more, Xarray is evolving to provide an
+extension point using ``Grouper`` objects.
+
+.. tip::
+
+See the `grouper design`_ doc for more detail on the motivation and design ideas behind
+Grouper objects.
+
+.. _grouper design: https://github.com/pydata/xarray/blob/main/design_notes/grouper_objects.md
+
+For now Xarray provides three specialized Grouper objects:
+
+1. :py:class:`groupers.UniqueGrouper` for categorical grouping
+2. :py:class:`groupers.BinGrouper` for binned grouping
+3. :py:class:`groupers.TimeResampler` for resampling along a datetime coordinate
+
+These provide functionality identical to the existing ``groupby``, ``groupby_bins``, and ``resample`` methods.
+That is,
+.. code-block:: python
+
+ds.groupby("x")
+
+is identical to
+.. code-block:: python
+
+from xarray.groupers import UniqueGrouper
+
+ds.groupby(x=UniqueGrouper())
+
+; and
+.. code-block:: python
+
+ds.groupby_bins("x", bins=bins)
+
+is identical to
+.. code-block:: python
+
+from xarray.groupers import BinGrouper
+
+ds.groupby(x=BinGrouper(bins))
+
+; and
+.. code-block:: python
+
+ds.resample(time="ME")
+
+is identical to
+.. code-block:: python
+
+from xarray.groupers import TimeResampler
+
+ds.resample(time=TimeResampler("ME"))
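
As a reading aid for the Grouper Objects section added to doc/user-guide/groupby.rst: the sketch below illustrates, in plain Python, what it can mean for a grouper to act as an extension point. It is not xarray's implementation. `Encoded`, `ToyUniqueGrouper`, `ToyBinGrouper`, `factorize`, and `groupby_mean` are hypothetical names introduced only for this illustration, and right-closed bins are an assumption.

```python
from bisect import bisect_left
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Encoded:
    """Hypothetical factorization result (not xarray's EncodedGroups)."""

    codes: list   # codes[i] is the group index of element i, or -1 for "no group"
    labels: list  # labels[k] is the label of group k


class ToyUniqueGrouper:
    """Categorical grouping: one group per distinct key value."""

    def factorize(self, keys):
        labels = sorted(set(keys))
        index = {label: k for k, label in enumerate(labels)}
        return Encoded([index[key] for key in keys], labels)


class ToyBinGrouper:
    """Binned grouping over right-closed intervals (lo, hi] (an assumption)."""

    def __init__(self, bins):
        self.bins = list(bins)

    def factorize(self, keys):
        labels = list(zip(self.bins[:-1], self.bins[1:]))
        codes = []
        for key in keys:
            # bisect_left finds the first edge >= key, so key is in (bins[k], bins[k + 1]].
            k = bisect_left(self.bins, key) - 1
            codes.append(k if 0 <= k < len(labels) else -1)
        return Encoded(codes, labels)


def groupby_mean(data, keys, grouper):
    """Split by the grouper's codes, apply a mean, combine into a dict."""
    encoded = grouper.factorize(keys)
    buckets = defaultdict(list)
    for code, value in zip(encoded.codes, data):
        if code >= 0:  # drop elements that fall outside every group
            buckets[code].append(value)
    return {encoded.labels[k]: sum(v) / len(v) for k, v in sorted(buckets.items())}


data = [1.0, 2.0, 3.0, 4.0]
by_letter = groupby_mean(data, ["a", "b", "a", "b"], ToyUniqueGrouper())
by_bin = groupby_mean(data, [1, 4, 6, 9], ToyBinGrouper([0, 5, 10]))
print(by_letter)
print(by_bin)
```

Swapping the grouper changes how the keys are split, while `groupby_mean` itself stays unchanged. That separation is the kind of extension point the section describes, here reduced to a toy.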

xarray/__init__.py

Lines changed: 1 addition & 2 deletions
@@ -56,6 +56,7 @@
 # `mypy --strict` running in projects that import xarray.
 __all__ = (
 # Sub-packages
+"groupers",
 "testing",
 "tutorial",
 # Top-level functions
@@ -95,8 +96,6 @@
 "unify_chunks",
 "where",
 "zeros_like",
-# Submodules
-"groupers",
 # Classes
 "CFTimeIndex",
 "Context",
