Description
What happened?
Since #7561 with xarray-2023.5.0, passing an IndexVariable to groupby makes the new grouper object raise an unexpected UnboundLocalError.
What did you expect to happen?
With xarray-2023.3.0 there was no issue: the groupby operation returned a new DataArray object.
Minimal Complete Verifiable Example
import numpy as np
import pandas as pd
import xarray as xr
da = xr.DataArray(
np.linspace(0, 1826, num=1827),
coords=[pd.date_range("2000-01-01", "2004-12-31", freq="D")],
dims="time",
)
iv = xr.IndexVariable(dims=("time",), data=pd.Index(da.time.dt.year))
# This is where the exception is raised
m = da.groupby(iv).mean()
print(m)
MVCE confirmation
- Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- Complete example — the example is self-contained, including all data and the text of any traceback.
- Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- New issue — a search of GitHub Issues suggests this is not a duplicate.
Relevant log output
---------------------------------------------------------------------------
UnboundLocalError Traceback (most recent call last)
Cell In[1], line 13
10 iv = xr.IndexVariable(dims=("time",), data=pd.Index(da.time.dt.year))
12 # This is where the exception is raised
---> 13 m = da.groupby(iv).mean()
14 print(m)
File /tmp/py310/lib/python3.10/site-packages/xarray/core/dataarray.py:6503, in DataArray.groupby(self, group, squeeze, restore_coord_dims)
6495 from xarray.core.groupby import (
6496 DataArrayGroupBy,
6497 ResolvedUniqueGrouper,
6498 UniqueGrouper,
6499 _validate_groupby_squeeze,
6500 )
6502 _validate_groupby_squeeze(squeeze)
-> 6503 rgrouper = ResolvedUniqueGrouper(UniqueGrouper(), group, self)
6504 return DataArrayGroupBy(
6505 self,
6506 (rgrouper,),
6507 squeeze=squeeze,
6508 restore_coord_dims=restore_coord_dims,
6509 )
File <string>:6, in __init__(self, grouper, group, obj)
File /tmp/py310/lib/python3.10/site-packages/xarray/core/groupby.py:335, in ResolvedGrouper.__post_init__(self)
334 def __post_init__(self) -> None:
--> 335 self.group: T_Group = _resolve_group(self.obj, self.group)
337 (
338 self.group1d,
339 self.stacked_obj,
340 self.stacked_dim,
341 self.inserted_dims,
342 ) = _ensure_1d(group=self.group, obj=self.obj)
File /tmp/py310/lib/python3.10/site-packages/xarray/core/groupby.py:640, in _resolve_group(obj, group)
637 else:
638 newgroup = group
--> 640 if newgroup.size == 0:
641 raise ValueError(f"{newgroup.name} must not be empty")
643 return newgroup
UnboundLocalError: local variable 'newgroup' referenced before assignment
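Judging from the traceback alone, the IndexVariable case seems to go through a branch of _resolve_group that never assigns newgroup before the size check. The sketch below reproduces that failure pattern in plain Python. It is an assumption about the shape of the bug, not xarray's actual code: FakeDataArray, FakeIndexVariable, resolve_group_sketch and fails_with_unbound_local are hypothetical names made up for illustration.

```python
# Hypothetical stand-ins for illustration; these are NOT xarray classes.
class FakeDataArray:
    def __init__(self, values):
        self.values = list(values)
        self.size = len(self.values)


class FakeIndexVariable:
    def __init__(self, values):
        self.values = list(values)
        self.size = len(self.values)


def resolve_group_sketch(group):
    """Mimic a branch chain in which one branch forgets to bind `newgroup`."""
    if isinstance(group, FakeDataArray):
        newgroup = group
    elif isinstance(group, FakeIndexVariable):
        # Assumed shape of the bug: this branch only validates its input
        # and never assigns `newgroup`.
        if not isinstance(group.values, list):
            raise ValueError("stand-in for a validation check")
    else:
        newgroup = FakeDataArray(group)

    # When the middle branch ran, `newgroup` is still unbound here.
    if newgroup.size == 0:
        raise ValueError("group must not be empty")
    return newgroup


def fails_with_unbound_local(group):
    """Return True if resolving `group` raises UnboundLocalError."""
    try:
        resolve_group_sketch(group)
    except UnboundLocalError:
        return True
    return False


if __name__ == "__main__":
    print(fails_with_unbound_local(FakeIndexVariable([2000, 2001])))
    print(fails_with_unbound_local(FakeDataArray([2000, 2001])))
```

If this reading is right, assigning newgroup in the IndexVariable branch would be enough to avoid the exception, but that needs to be confirmed against the real source.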
Anything else we need to know?
With xarray-2023.3.0 the output for the example is:
<xarray.DataArray (time: 5)>
array([ 182.5, 548. , 913. , 1278. , 1643.5])
Coordinates:
* time (time) int64 2000 2001 2002 2003 2004
Environment
INSTALLED VERSIONS
commit: None
python: 3.10.6 (main, Mar 10 2023, 10:55:28) [GCC 11.3.0]
python-bits: 64
OS: Linux
OS-release: 4.4.0-19041-Microsoft
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None
xarray: 2023.5.0
pandas: 1.5.3
numpy: 1.24.3
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.7.1
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 59.6.0
pip: 23.1.2
conda: None
pytest: 7.3.1
mypy: None
IPython: 8.13.2
sphinx: None