What happened:
The task is opening a dataset (e.g. a netcdf or zarr file) with a time coordinate using use_cftime=True. Delaying the task with dask results in the time coordinate being represented as cftime.datetime objects, whereas when the task is not delayed cftime.Datetime<Calendar> objects are used.
What you expected to happen:
Consistent cftime objects to be used, regardless of whether the opening task is delayed or not.
Minimal Complete Verifiable Example:
import dask
import numpy as np
import xarray as xr
from dask.distributed import LocalCluster, Client

cluster = LocalCluster()
client = Client(cluster)

# Write some data
var = np.random.random(4)
time = xr.cftime_range('2000-01-01', periods=4, calendar='julian')
ds = xr.Dataset(data_vars={'var': ('time', var)},
                coords={'time': time})
ds.to_netcdf('test.nc', mode='w')

# Open written data
ds1 = xr.open_dataset('test.nc', use_cftime=True)
print(f'ds1: {ds1.time}\n')

# Delayed open written data
ds2 = dask.delayed(xr.open_dataset)('test.nc', use_cftime=True)
ds2 = dask.compute(ds2)[0]
print(f'ds2: {ds2.time}\n')

# Operations like xr.open_mfdataset which use dask.delayed internally
# when parallel=True (I think) produce the same result as ds2
ds3 = xr.open_mfdataset('test.nc', use_cftime=True, parallel=True)
print(f'ds3: {ds3.time}')
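To compare the element types without reading the full repr of each coordinate, one option is to collect the class names of the values. This is only a minimal sketch: `element_type_names` is a hypothetical helper, not part of xarray or cftime, and it is shown here on plain Python values so that it runs on its own.

```python
def element_type_names(values):
    """Return the sorted class names found among the elements of `values`."""
    return sorted({type(value).__name__ for value in values})


# Works on any iterable; the list below is a stand-in for a time coordinate.
print(element_type_names([1, 2.0, "three"]))  # prints ['float', 'int', 'str']
```

With the datasets from the example, this would be called as `element_type_names(ds1.time.values)` and `element_type_names(ds2.time.values)`. Going by the description in this issue, the first would be expected to report a calendar-specific class such as `DatetimeJulian` and the second the base `datetime` class when the bug is present.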
Anything else we need to know?:
I noticed this because the DatetimeAccessor ceil, floor and round methods raise errors for cftime.datetime objects (but not for cftime.Datetime<Calendar> objects) for all calendar types other than 'gregorian'. For example,
ds3.time.dt.floor('D')
returns the following traceback:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-10-613e63624953> in <module>
----> 1 ds3.time.dt.floor('D')
/g/data/xv83/ds0092/software/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/core/accessor_dt.py in floor(self, freq)
220 """
221
--> 222 return self._tslib_round_accessor("floor", freq)
223
224 def ceil(self, freq):
/g/data/xv83/ds0092/software/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/core/accessor_dt.py in _tslib_round_accessor(self, name, freq)
202 def _tslib_round_accessor(self, name, freq):
203 obj_type = type(self._obj)
--> 204 result = _round_field(self._obj.data, name, freq)
205 return obj_type(result, name=name, coords=self._obj.coords, dims=self._obj.dims)
206
/g/data/xv83/ds0092/software/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/core/accessor_dt.py in _round_field(values, name, freq)
142 )
143 else:
--> 144 return _round_through_series_or_index(values, name, freq)
145
146
/g/data/xv83/ds0092/software/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/core/accessor_dt.py in _round_through_series_or_index(values, name, freq)
110 method = getattr(values_as_cftimeindex, name)
111
--> 112 field_values = method(freq=freq).values
113
114 return field_values.reshape(values.shape)
/g/data/xv83/ds0092/software/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/coding/cftimeindex.py in floor(self, freq)
733 CFTimeIndex
734 """
--> 735 return self._round_via_method(freq, _floor_int)
736
737 def ceil(self, freq):
/g/data/xv83/ds0092/software/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/coding/cftimeindex.py in _round_via_method(self, freq, method)
714
715 unit = _total_microseconds(offset.as_timedelta())
--> 716 values = self.asi8
717 rounded = method(values, unit)
718 return _cftimeindex_from_i8(rounded, self.date_type, self.name)
/g/data/xv83/ds0092/software/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/coding/cftimeindex.py in asi8(self)
684 epoch = self.date_type(1970, 1, 1)
685 return np.array(
--> 686 [
687 _total_microseconds(exact_cftime_datetime_difference(epoch, date))
688 for date in self.values
/g/data/xv83/ds0092/software/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/coding/cftimeindex.py in <listcomp>(.0)
685 return np.array(
686 [
--> 687 _total_microseconds(exact_cftime_datetime_difference(epoch, date))
688 for date in self.values
689 ],
/g/data/xv83/ds0092/software/miniconda3/envs/pangeo/lib/python3.9/site-packages/xarray/core/resample_cftime.py in exact_cftime_datetime_difference(a, b)
356 datetime.timedelta
357 """
--> 358 seconds = b.replace(microsecond=0) - a.replace(microsecond=0)
359 seconds = int(round(seconds.total_seconds()))
360 microseconds = b.microsecond - a.microsecond
src/cftime/_cftime.pyx in cftime._cftime.datetime.__sub__()
TypeError: cannot compute the time difference between dates with different calendars
My apologies for conflating two issues here. I'm happy to open a separate issue for this if that's preferred.
Environment:
Output of xr.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.9.4 | packaged by conda-forge | (default, May 10 2021, 22:13:33)
[GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 4.18.0-305.19.1.el8.nci.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: None
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.10.6
libnetcdf: 4.7.4
Sorry for not responding to this issue earlier -- I think this is related to #5686 (see discussion and links there for more details). I can reproduce your issue with cftime version 1.5.0, and I tested things with cftime version 1.5.1 and it was fixed (i.e. cftime.DatetimeJulian objects are returned in all cases).
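For anyone wanting to guard against this in their own scripts, a small version check can flag an environment that predates the release reported as fixed above. This is a rough sketch under stated assumptions: `release_tuple` and `has_reported_fix` are hypothetical helpers, the `1.5.1` threshold comes only from the test described in this comment, and nothing here says whether versions older than 1.5.0 show the problem.

```python
import re


def release_tuple(version):
    """Return the leading numeric components of a version string as ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric release segment in {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def has_reported_fix(version, fixed_in="1.5.1"):
    """True if `version` is at least `fixed_in` (assumed threshold, see above)."""
    return release_tuple(version) >= release_tuple(fixed_in)


print(has_reported_fix("1.5.0"))  # prints False
print(has_reported_fix("1.5.1"))  # prints True
```

In practice the argument would be the installed version string, for example `has_reported_fix(cftime.__version__)`.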
Environment:
Output of xr.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.9.4 | packaged by conda-forge | (default, May 10 2021, 22:13:33)
[GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 4.18.0-305.19.1.el8.nci.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: None
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.10.6
libnetcdf: 4.7.4
xarray: 0.20.1
pandas: 1.3.4
numpy: 1.21.4
scipy: 1.6.3
netCDF4: 1.5.6
pydap: None
h5netcdf: 0.11.0
h5py: 3.3.0
Nio: None
zarr: 2.9.5
cftime: 1.5.0
nc_time_axis: 1.4.0
PseudoNetCDF: None
rasterio: 1.2.4
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2021.11.2
distributed: 2021.11.2
matplotlib: 3.4.2
cartopy: 0.19.0.post1
seaborn: None
numbagg: None
fsspec: 2021.05.0
cupy: None
pint: 0.18
sparse: None
setuptools: 49.6.0.post20210108
pip: 21.1.2
conda: 4.10.1
pytest: None
IPython: 7.24.0
sphinx: None