Grouper object does not handle IndexVariable #7919

@mwtoews

What happened?

Since #7561 (in xarray-2023.5.0), grouping by an IndexVariable raises an unexpected UnboundLocalError in the new grouper object.

What did you expect to happen?

With xarray-2023.3.0 there was no issue: the groupby operation returned a new DataArray object.

Minimal Complete Verifiable Example

import numpy as np
import pandas as pd
import xarray as xr

da = xr.DataArray(
    np.linspace(0, 1826, num=1827),
    coords=[pd.date_range("2000-01-01", "2004-12-31", freq="D")],
    dims="time",
)
iv = xr.IndexVariable(dims=("time",), data=pd.Index(da.time.dt.year))

# This is where the exception is raised
m = da.groupby(iv).mean()
print(m)

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
Cell In[1], line 13
     10 iv = xr.IndexVariable(dims=("time",), data=pd.Index(da.time.dt.year))
     12 # This is where the exception is raised
---> 13 m = da.groupby(iv).mean()
     14 print(m)

File /tmp/py310/lib/python3.10/site-packages/xarray/core/dataarray.py:6503, in DataArray.groupby(self, group, squeeze, restore_coord_dims)
   6495 from xarray.core.groupby import (
   6496     DataArrayGroupBy,
   6497     ResolvedUniqueGrouper,
   6498     UniqueGrouper,
   6499     _validate_groupby_squeeze,
   6500 )
   6502 _validate_groupby_squeeze(squeeze)
-> 6503 rgrouper = ResolvedUniqueGrouper(UniqueGrouper(), group, self)
   6504 return DataArrayGroupBy(
   6505     self,
   6506     (rgrouper,),
   6507     squeeze=squeeze,
   6508     restore_coord_dims=restore_coord_dims,
   6509 )

File <string>:6, in __init__(self, grouper, group, obj)

File /tmp/py310/lib/python3.10/site-packages/xarray/core/groupby.py:335, in ResolvedGrouper.__post_init__(self)
    334 def __post_init__(self) -> None:
--> 335     self.group: T_Group = _resolve_group(self.obj, self.group)
    337     (
    338         self.group1d,
    339         self.stacked_obj,
    340         self.stacked_dim,
    341         self.inserted_dims,
    342     ) = _ensure_1d(group=self.group, obj=self.obj)

File /tmp/py310/lib/python3.10/site-packages/xarray/core/groupby.py:640, in _resolve_group(obj, group)
    637     else:
    638         newgroup = group
--> 640 if newgroup.size == 0:
    641     raise ValueError(f"{newgroup.name} must not be empty")
    643 return newgroup

UnboundLocalError: local variable 'newgroup' referenced before assignment
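
The traceback suggests that at least one code path through _resolve_group reaches the size check without ever assigning newgroup. Below is a minimal sketch of that failure mode in plain Python. It is an illustration only: FakeDataArray, FakeIndexVariable, resolve_group_buggy and resolve_group_fixed are hypothetical stand-ins, and the branch layout is an assumption, not xarray's actual source or its eventual fix.

```python
# Hypothetical, simplified stand-ins; these are NOT xarray's classes and
# this is NOT the real _resolve_group implementation.
class FakeDataArray:
    def __init__(self, values):
        self.values = list(values)
        self.size = len(self.values)


class FakeIndexVariable:
    def __init__(self, values):
        self.values = list(values)
        self.size = len(self.values)


def resolve_group_buggy(group):
    # Assumed shape of the bug: two types are accepted, but only one of
    # them leads to an assignment of `newgroup`.
    if isinstance(group, (FakeDataArray, FakeIndexVariable)):
        if isinstance(group, FakeDataArray):
            newgroup = group
        # No branch assigns `newgroup` for FakeIndexVariable.
    else:
        raise TypeError("unsupported group type")
    if newgroup.size == 0:  # raises UnboundLocalError for FakeIndexVariable
        raise ValueError("group must not be empty")
    return newgroup


def resolve_group_fixed(group):
    # Every accepted type assigns `newgroup` before it is read.
    if isinstance(group, FakeDataArray):
        newgroup = group
    elif isinstance(group, FakeIndexVariable):
        newgroup = FakeDataArray(group.values)
    else:
        raise TypeError("unsupported group type")
    if newgroup.size == 0:
        raise ValueError("group must not be empty")
    return newgroup


if __name__ == "__main__":
    years = FakeIndexVariable([2000, 2001])
    try:
        resolve_group_buggy(years)
    except UnboundLocalError as exc:
        print(type(exc).__name__)
    print(resolve_group_fixed(years).size)
```

Giving each accepted type its own branch, as in the second function, is one way to ensure the variable is always bound before use; whether xarray should convert the IndexVariable or handle it differently is left to the maintainers.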

Anything else we need to know?

With xarray-2023.3.0 the output for the example is:

<xarray.DataArray (time: 5)>
array([ 182.5,  548. ,  913. , 1278. , 1643.5])
Coordinates:
  * time     (time) int64 2000 2001 2002 2003 2004

Environment

INSTALLED VERSIONS

commit: None
python: 3.10.6 (main, Mar 10 2023, 10:55:28) [GCC 11.3.0]
python-bits: 64
OS: Linux
OS-release: 4.4.0-19041-Microsoft
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 2023.5.0
pandas: 1.5.3
numpy: 1.24.3
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.7.1
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 59.6.0
pip: 23.1.2
conda: None
pytest: 7.3.1
mypy: None
IPython: 8.13.2
sphinx: None
