Commit 4e41fc3

dcherian committed
Merge branch 'master' into yohai-ds_scatter
* master:
  remove xfail from test_cross_engine_read_write_netcdf4 (pydata#2741)
  Reenable cross engine read write netCDF test (pydata#2739)
  remove bottleneck dev build from travis, this test env was failing to build (pydata#2736)
  CFTimeIndex Resampling (pydata#2593)
  add tests for handling of empty pandas objects in constructors (pydata#2735)
  dropna() for a Series indexed by a CFTimeIndex (pydata#2734)
  deprecate compat & encoding (pydata#2703)
  Implement integrate (pydata#2653)
  ENH: resample methods with tolerance (pydata#2716)
  improve error message for invalid encoding (pydata#2730)
  silence a couple of warnings (pydata#2727)
2 parents 7392c81 + 27cf53f commit 4e41fc3

24 files changed, with 899 additions and 131 deletions

.github/stale.yml

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -28,7 +28,8 @@ staleLabel: stale
2828
# Comment to post when marking as stale. Set to `false` to disable
2929
markComment: |
3030
In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity
31-
If this issue remains relevant, please comment here; otherwise it will be marked as closed automatically
31+
32+
If this issue remains relevant, please comment here or remove the `stale` label; otherwise it will be marked as closed automatically
3233
3334
# Comment to post when removing the stale label.
3435
# unmarkComment: >

.travis.yml

Lines changed: 0 additions & 2 deletions
@@ -19,7 +19,6 @@ matrix:
1919
- EXTRA_FLAGS="--run-flaky --run-network-tests"
2020
- env: CONDA_ENV=py36-dask-dev
2121
- env: CONDA_ENV=py36-pandas-dev
22-
- env: CONDA_ENV=py36-bottleneck-dev
2322
- env: CONDA_ENV=py36-rasterio
2423
- env: CONDA_ENV=py36-zarr-dev
2524
- env: CONDA_ENV=docs
@@ -31,7 +30,6 @@ matrix:
3130
- CONDA_ENV=py36
3231
- EXTRA_FLAGS="--run-flaky --run-network-tests"
3332
- env: CONDA_ENV=py36-pandas-dev
34-
- env: CONDA_ENV=py36-bottleneck-dev
3533
- env: CONDA_ENV=py36-zarr-dev
3634

3735
before_install:

ci/requirements-py36-bottleneck-dev.yml

Lines changed: 0 additions & 24 deletions
This file was deleted.

doc/api.rst

Lines changed: 2 additions & 0 deletions
@@ -152,6 +152,7 @@ Computation
152152
Dataset.diff
153153
Dataset.quantile
154154
Dataset.differentiate
155+
Dataset.integrate
155156

156157
**Aggregation**:
157158
:py:attr:`~Dataset.all`
@@ -321,6 +322,7 @@ Computation
321322
DataArray.dot
322323
DataArray.quantile
323324
DataArray.differentiate
325+
DataArray.integrate
324326

325327
**Aggregation**:
326328
:py:attr:`~DataArray.all`

doc/computation.rst

Lines changed: 12 additions & 2 deletions
@@ -240,6 +240,8 @@ function or method name to ``coord_func`` option,
240240
da.coarsen(time=7, x=2, coord_func={'time': 'min'}).mean()
241241
242242
243+
.. _compute.using_coordinates:
244+
243245
Computation using Coordinates
244246
=============================
245247

@@ -261,9 +263,17 @@ This method can be used also for multidimensional arrays,
261263
coords={'x': [0.1, 0.11, 0.2, 0.3]})
262264
a.differentiate('x')
263265
266+
:py:meth:`~xarray.DataArray.integrate` computes integration based on
267+
trapezoidal rule using their coordinates,
268+
269+
.. ipython:: python
270+
271+
a.integrate('x')
272+
264273
.. note::
265-
This method is limited to simple cartesian geometry. Differentiation along
266-
multidimensional coordinate is not supported.
274+
These methods are limited to simple cartesian geometry. Differentiation
275+
and integration along multidimensional coordinate are not supported.
276+
267277

268278
.. _compute.broadcasting:
269279
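The trapezoidal rule that the new ``integrate`` documentation refers to can be illustrated without xarray. The following is a minimal pure-Python sketch, not the library's implementation: the helper name ``trapezoid`` and the sample values ``y`` are assumptions made for illustration, while the coordinate values mirror the ``x`` coordinate shown in the documentation hunk above.

```python
def trapezoid(values, coords):
    """Integrate ``values`` over ``coords`` with the trapezoidal rule.

    Hypothetical helper for illustration only; it handles a single
    one-dimensional coordinate, matching the note that multidimensional
    coordinates are not supported.
    """
    if len(values) != len(coords):
        raise ValueError("values and coords must have the same length")
    total = 0.0
    for i in range(1, len(coords)):
        width = coords[i] - coords[i - 1]                # spacing may be uneven
        mean_height = 0.5 * (values[i] + values[i - 1])  # average of both ends
        total += width * mean_height
    return total


x = [0.1, 0.11, 0.2, 0.3]   # coordinate values from the docs example
y = [0.0, 1.0, 2.0, 3.0]    # assumed sample data
area = trapezoid(y, x)
```

Because each interval is weighted by its own width, unevenly spaced coordinates such as the ones above are handled naturally, which is the point of integrating "using their coordinates" rather than by array position.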

doc/time-series.rst

Lines changed: 23 additions & 12 deletions
@@ -196,11 +196,20 @@ resampling group:
196196
197197
ds.resample(time='6H').reduce(np.mean)
198198
199-
For upsampling, xarray provides four methods: ``asfreq``, ``ffill``, ``bfill``,
200-
and ``interpolate``. ``interpolate`` extends ``scipy.interpolate.interp1d`` and
201-
supports all of its schemes. All of these resampling operations work on both
199+
For upsampling, xarray provides six methods: ``asfreq``, ``ffill``, ``bfill``, ``pad``,
200+
``nearest`` and ``interpolate``. ``interpolate`` extends ``scipy.interpolate.interp1d``
201+
and supports all of its schemes. All of these resampling operations work on both
202202
Dataset and DataArray objects with an arbitrary number of dimensions.
203203

204+
In order to limit the scope of the methods ``ffill``, ``bfill``, ``pad`` and
205+
``nearest`` the ``tolerance`` argument can be set in coordinate units.
206+
Data that has indices outside of the given ``tolerance`` are set to ``NaN``.
207+
208+
.. ipython:: python
209+
210+
ds.resample(time='1H').nearest(tolerance='1H')
211+
212+
204213
For more examples of using grouped operations on a time dimension, see
205214
:ref:`toy weather data`.
206215

@@ -300,31 +309,34 @@ For data indexed by a :py:class:`~xarray.CFTimeIndex` xarray currently supports:
300309
301310
da.differentiate('time')
302311
303-
- And serialization:
312+
- Serialization:
304313

305314
.. ipython:: python
306315
307316
da.to_netcdf('example-no-leap.nc')
308317
xr.open_dataset('example-no-leap.nc')
309318
319+
- And resampling along the time dimension for data indexed by a :py:class:`~xarray.CFTimeIndex`:
320+
321+
.. ipython:: python
322+
323+
da.resample(time='81T', closed='right', label='right', base=3).mean()
324+
310325
.. note::
311326

312327
While much of the time series functionality that is possible for standard
313328
dates has been implemented for dates from non-standard calendars, there are
314329
still some remaining important features that have yet to be implemented,
315330
for example:
316331

317-
- Resampling along the time dimension for data indexed by a
318-
:py:class:`~xarray.CFTimeIndex` (:issue:`2191`, :issue:`2458`)
319332
- Built-in plotting of data with :py:class:`cftime.datetime` coordinate axes
320333
(:issue:`2164`).
321334

322335
For some use-cases it may still be useful to convert from
323336
a :py:class:`~xarray.CFTimeIndex` to a :py:class:`pandas.DatetimeIndex`,
324-
despite the difference in calendar types (e.g. to allow the use of some
325-
forms of resample with non-standard calendars). The recommended way of
326-
doing this is to use the built-in
327-
:py:meth:`~xarray.CFTimeIndex.to_datetimeindex` method:
337+
despite the difference in calendar types. The recommended way of doing this
338+
is to use the built-in :py:meth:`~xarray.CFTimeIndex.to_datetimeindex`
339+
method:
328340

329341
.. ipython:: python
330342
:okwarning:
@@ -334,8 +346,7 @@ For data indexed by a :py:class:`~xarray.CFTimeIndex` xarray currently supports:
334346
da
335347
datetimeindex = da.indexes['time'].to_datetimeindex()
336348
da['time'] = datetimeindex
337-
da.resample(time='Y').mean('time')
338-
349+
339350
However in this case one should use caution to only perform operations which
340351
do not depend on differences between dates (e.g. differentiation,
341352
interpolation, or upsampling with resample), as these could introduce subtle
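The ``tolerance`` option documented in the time-series changes above can be sketched in plain Python. This is an illustrative approximation under stated assumptions, not xarray's or pandas' code: times are plain numbers (think hours), the helper name ``nearest_with_tolerance`` is hypothetical, the source index is assumed to be non-empty, and the tolerance is treated as inclusive.

```python
import math


def nearest_with_tolerance(source_times, source_values, target_times, tolerance):
    """Fill each target time from the nearest source time.

    Targets whose nearest source lies further away than ``tolerance``
    (in coordinate units) are set to NaN.
    """
    result = []
    for t in target_times:
        # Index of the closest source time to this target time.
        best = min(range(len(source_times)),
                   key=lambda i: abs(source_times[i] - t))
        if abs(source_times[best] - t) <= tolerance:
            result.append(source_values[best])
        else:
            result.append(math.nan)
    return result


source_times = [0, 6, 12]            # e.g. 6-hourly data
source_values = [10.0, 20.0, 30.0]
target_times = list(range(0, 13))    # upsample to an hourly grid
filled = nearest_with_tolerance(source_times, source_values,
                                target_times, tolerance=1)
```

With a tolerance of one unit, only the hourly points directly adjacent to an original sample are filled; the rest stay ``NaN`` instead of being padded from a distant value.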

doc/whats-new.rst

Lines changed: 20 additions & 0 deletions
@@ -24,6 +24,10 @@ Breaking changes
2424
- Remove support for Python 2. This is the first version of xarray that is
2525
Python 3 only. (:issue:`1876`).
2626
By `Joe Hamman <https://github.com/jhamman>`_.
27+
- The `compat` argument to `Dataset` and the `encoding` argument to
28+
`DataArray` are deprecated and will be removed in a future release.
29+
(:issue:`1188`)
30+
By `Maximilian Roos <https://github.com/max-sixty>`_.
2731

2832
Enhancements
2933
~~~~~~~~~~~~
@@ -45,6 +49,21 @@ Enhancements
4549
By `Benoit Bovy <https://github.com/benbovy>`_.
4650
- Dataset plotting API! Currently only :py:meth:`Dataset.plot.scatter` is implemented.
4751
By `Yohai Bar Sinai <https://github.com/yohai>`_ and `Deepak Cherian <https://github.com/dcherian>`_
52+
- Resampling of standard and non-standard calendars indexed by
53+
:py:class:`~xarray.CFTimeIndex` is now possible. (:issue:`2191`).
54+
By `Jwen Fai Low <https://github.com/jwenfai>`_ and
55+
`Spencer Clark <https://github.com/spencerkclark>`_.
56+
- Add ``tolerance`` option to ``resample()`` methods ``bfill``, ``pad``,
57+
``nearest``. (:issue:`2695`)
58+
By `Hauke Schulz <https://github.com/observingClouds>`_.
59+
- :py:meth:`~xarray.DataArray.integrate` and
60+
:py:meth:`~xarray.Dataset.integrate` are newly added.
61+
See :ref:`_compute.using_coordinates` for the detail.
62+
(:issue:`1332`)
63+
By `Keisuke Fujii <https://github.com/fujiisoup>`_.
64+
- :py:meth:`pandas.Series.dropna` is now supported for a
65+
:py:class:`pandas.Series` indexed by a :py:class:`~xarray.CFTimeIndex`
66+
(:issue:`2688`). By `Spencer Clark <https://github.com/spencerkclark>`_.
4867

4968
Bug fixes
5069
~~~~~~~~~
@@ -114,6 +133,7 @@ Breaking changes
114133
(:issue:`2565`). The previous behavior was to decode them only if they
115134
had specific time attributes, now these attributes are copied
116135
automatically from the corresponding time coordinate. This might
136+
break downstream code that was relying on these variables to be
117137
brake downstream code that was relying on these variables to be
118138
not decoded.
119139
By `Fabien Maussion <https://github.com/fmaussion>`_.

xarray/backends/netCDF4_.py

Lines changed: 3 additions & 2 deletions
@@ -217,8 +217,9 @@ def _extract_nc4_variable_encoding(variable, raise_on_invalid=False,
217217
if raise_on_invalid:
218218
invalid = [k for k in encoding if k not in valid_encodings]
219219
if invalid:
220-
raise ValueError('unexpected encoding parameters for %r backend: '
221-
' %r' % (backend, invalid))
220+
raise ValueError(
221+
'unexpected encoding parameters for %r backend: %r. Valid '
222+
'encodings are: %r' % (backend, invalid, valid_encodings))
222223
else:
223224
for k in list(encoding):
224225
if k not in valid_encodings:
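To see what the reworded error looks like to a user, here is a small standalone sketch of the validation step. The function name ``check_encoding`` and the sample keys are hypothetical; only the message template is taken from the hunk above. A sorted list stands in for the set of valid encodings so that the printed order is deterministic.

```python
def check_encoding(encoding, valid_encodings, backend="netCDF4"):
    """Raise ValueError naming both the invalid keys and the valid ones.

    Hypothetical stand-in for the ``raise_on_invalid`` branch shown above.
    """
    invalid = [k for k in encoding if k not in valid_encodings]
    if invalid:
        raise ValueError(
            'unexpected encoding parameters for %r backend: %r. Valid '
            'encodings are: %r' % (backend, invalid, valid_encodings))


message = None
try:
    # 'compresion' is a deliberate typo to trigger the error.
    check_encoding({'zlib': True, 'compresion': 'gzip'},
                   ['complevel', 'zlib'])
except ValueError as err:
    message = str(err)
```

Listing the valid encodings in the message lets the caller spot a misspelled key without having to consult the backend source.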

xarray/coding/cftime_offsets.py

Lines changed: 21 additions & 4 deletions
@@ -358,29 +358,41 @@ def rollback(self, date):
358358
class Day(BaseCFTimeOffset):
359359
_freq = 'D'
360360

361+
def as_timedelta(self):
362+
return timedelta(days=self.n)
363+
361364
def __apply__(self, other):
362-
return other + timedelta(days=self.n)
365+
return other + self.as_timedelta()
363366

364367

365368
class Hour(BaseCFTimeOffset):
366369
_freq = 'H'
367370

371+
def as_timedelta(self):
372+
return timedelta(hours=self.n)
373+
368374
def __apply__(self, other):
369-
return other + timedelta(hours=self.n)
375+
return other + self.as_timedelta()
370376

371377

372378
class Minute(BaseCFTimeOffset):
373379
_freq = 'T'
374380

381+
def as_timedelta(self):
382+
return timedelta(minutes=self.n)
383+
375384
def __apply__(self, other):
376-
return other + timedelta(minutes=self.n)
385+
return other + self.as_timedelta()
377386

378387

379388
class Second(BaseCFTimeOffset):
380389
_freq = 'S'
381390

391+
def as_timedelta(self):
392+
return timedelta(seconds=self.n)
393+
382394
def __apply__(self, other):
383-
return other + timedelta(seconds=self.n)
395+
return other + self.as_timedelta()
384396

385397

386398
_FREQUENCIES = {
@@ -427,6 +439,11 @@ def __apply__(self, other):
427439
_FREQUENCY_CONDITION)
428440

429441

442+
# pandas defines these offsets as "Tick" objects, which for instance have
443+
# distinct behavior from monthly or longer frequencies in resample.
444+
CFTIME_TICKS = (Day, Hour, Minute, Second)
445+
446+
430447
def to_offset(freq):
431448
"""Convert a frequency string to the appropriate subclass of
432449
BaseCFTimeOffset."""
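The refactoring in this file follows one pattern: each fixed-length offset exposes its length through ``as_timedelta`` and reuses it in ``__apply__``. The sketch below reproduces that pattern in isolation. It is a simplification under assumptions: ``TickOffset`` is a hypothetical base class standing in for ``BaseCFTimeOffset``, only two of the four offsets are shown, and ``datetime.datetime`` stands in for a ``cftime`` date.

```python
from datetime import datetime, timedelta


class TickOffset:
    """Hypothetical minimal base class for fixed-length offsets."""

    def __init__(self, n=1):
        self.n = n

    def as_timedelta(self):
        raise NotImplementedError

    def __apply__(self, other):
        # Shared by all subclasses: shift the date by the offset's length.
        return other + self.as_timedelta()


class Day(TickOffset):
    def as_timedelta(self):
        return timedelta(days=self.n)


class Hour(TickOffset):
    def as_timedelta(self):
        return timedelta(hours=self.n)


# Mirrors the idea of CFTIME_TICKS: a tuple usable with isinstance().
TICKS = (Day, Hour)

shifted = Day(2).__apply__(datetime(2000, 1, 1))
is_tick = isinstance(Hour(3), TICKS)
```

Exposing the length as a ``timedelta`` means code such as a resampler can do arithmetic on the offset itself, and the tuple of tick classes gives a single ``isinstance`` check to tell fixed-length frequencies apart from monthly or longer ones.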

xarray/coding/cftimeindex.py

Lines changed: 5 additions & 3 deletions
@@ -335,11 +335,13 @@ def _maybe_cast_slice_bound(self, label, side, kind):
335335
# e.g. series[1:5].
336336
def get_value(self, series, key):
337337
"""Adapted from pandas.tseries.index.DatetimeIndex.get_value"""
338-
if not isinstance(key, slice):
339-
return series.iloc[self.get_loc(key)]
340-
else:
338+
if np.asarray(key).dtype == np.dtype(bool):
339+
return series.iloc[key]
340+
elif isinstance(key, slice):
341341
return series.iloc[self.slice_indexer(
342342
key.start, key.stop, key.step)]
343+
else:
344+
return series.iloc[self.get_loc(key)]
343345

344346
def __contains__(self, key):
345347
"""Adapted from

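The reordered branches in ``get_value`` matter because a boolean mask is not a ``slice``, so under the old ordering it would fall through to the label lookup. The following sketch shows the new dispatch order with plain lists. It is an approximation under assumptions: the helper names are hypothetical, the mask test uses ``isinstance`` checks instead of the NumPy dtype comparison in the hunk above, and slices are applied positionally rather than through ``slice_indexer``.

```python
def classify_key(key):
    """Return 'mask', 'slice' or 'label', checking the mask case first."""
    if (isinstance(key, (list, tuple)) and len(key) > 0
            and all(isinstance(k, bool) for k in key)):
        return 'mask'
    elif isinstance(key, slice):
        return 'slice'
    else:
        return 'label'


def get_value(values, labels, key):
    """Hypothetical list-based analogue of the index lookup."""
    kind = classify_key(key)
    if kind == 'mask':
        # Keep the positions where the mask is True.
        return [v for v, keep in zip(values, key) if keep]
    elif kind == 'slice':
        return values[key]
    else:
        return values[labels.index(key)]


labels = ['a', 'b', 'c']
values = [1.0, float('nan'), 3.0]

# A dropna-style mask: NaN is the only value not equal to itself.
mask = [v == v for v in values]
kept = get_value(values, labels, mask)
```

An operation like ``dropna`` builds exactly this kind of mask, which is presumably why the mask branch had to be added before the existing slice and label branches.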
xarray/core/alignment.py

Lines changed: 1 addition & 1 deletion
@@ -495,7 +495,7 @@ def _broadcast_array(array):
495495
coords = OrderedDict(array.coords)
496496
coords.update(common_coords)
497497
return DataArray(data, coords, data.dims, name=array.name,
498-
attrs=array.attrs, encoding=array.encoding)
498+
attrs=array.attrs)
499499

500500
def _broadcast_dataset(ds):
501501
data_vars = OrderedDict(

xarray/core/common.py

Lines changed: 15 additions & 15 deletions
@@ -713,6 +713,13 @@ def resample(self, indexer=None, skipna=None, closed=None, label=None,
713713
array([ 0. , 0.032258, 0.064516, ..., 10.935484, 10.967742, 11. ])
714714
Coordinates:
715715
* time (time) datetime64[ns] 1999-12-15 1999-12-16 1999-12-17 ...
716+
717+
Limit scope of upsampling method
718+
>>> da.resample(time='1D').nearest(tolerance='1D')
719+
<xarray.DataArray (time: 337)>
720+
array([ 0., 0., nan, ..., nan, 11., 11.])
721+
Coordinates:
722+
* time (time) datetime64[ns] 1999-12-15 1999-12-16 ... 2000-11-15
716723
717724
References
718725
----------
@@ -749,23 +756,16 @@ def resample(self, indexer=None, skipna=None, closed=None, label=None,
749756
dim_coord = self[dim]
750757

751758
if isinstance(self.indexes[dim_name], CFTimeIndex):
752-
raise NotImplementedError(
753-
'Resample is currently not supported along a dimension '
754-
'indexed by a CFTimeIndex. For certain kinds of downsampling '
755-
'it may be possible to work around this by converting your '
756-
'time index to a DatetimeIndex using '
757-
'CFTimeIndex.to_datetimeindex. Use caution when doing this '
758-
'however, because switching to a DatetimeIndex from a '
759-
'CFTimeIndex with a non-standard calendar entails a change '
760-
'in the calendar type, which could lead to subtle and silent '
761-
'errors.'
762-
)
763-
759+
from .resample_cftime import CFTimeGrouper
760+
grouper = CFTimeGrouper(freq, closed, label, base, loffset)
761+
else:
762+
# TODO: to_offset() call required for pandas==0.19.2
763+
grouper = pd.Grouper(freq=freq, closed=closed, label=label,
764+
base=base,
765+
loffset=pd.tseries.frequencies.to_offset(
766+
loffset))
764767
group = DataArray(dim_coord, coords=dim_coord.coords,
765768
dims=dim_coord.dims, name=RESAMPLE_DIM)
766-
# TODO: to_offset() call required for pandas==0.19.2
767-
grouper = pd.Grouper(freq=freq, closed=closed, label=label, base=base,
768-
loffset=pd.tseries.frequencies.to_offset(loffset))
769769
resampler = self._resample_cls(self, group=group, dim=dim_name,
770770
grouper=grouper,
771771
resample_dim=RESAMPLE_DIM)
