Numpy raises warning in xarray.coding.times.cast_to_int_if_safe #7942

@mx-moth

What happened?

In recent versions of numpy, calling numpy.asarray(arr, dtype=numpy.int64) emits a RuntimeWarning if the input array contains numpy.nan values. xarray.coding.times.cast_to_int_if_safe(num) makes exactly this call:

def cast_to_int_if_safe(num) -> np.ndarray:
    int_num = np.asarray(num, dtype=np.int64)
    if (num == int_num).all():
        num = int_num
    return num

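The function still returns the correct result regardless of the warning: NaN never compares equal to its integer cast, so the equality check fails and the original array is returned unchanged.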
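
The behaviour can be reproduced without xarray or netCDF. The snippet below is a minimal sketch that copies the function quoted above and records any warnings raised while calling it; the input array is made up for illustration, and whether the cast warns depends on the numpy version and platform.

```python
import warnings

import numpy as np


def cast_to_int_if_safe(num) -> np.ndarray:
    # Copy of the function quoted above, reproduced here for illustration.
    int_num = np.asarray(num, dtype=np.int64)
    if (num == int_num).all():
        num = int_num
    return num


# Illustrative input: whole numbers with NaN in the gaps.
values = np.array([0.0, 1.0, np.nan, 3.0, np.nan])

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = cast_to_int_if_safe(values)

# NaN never compares equal to anything, so the float array comes back as-is.
print(result.dtype)
# Any warnings recorded during the call (expected on numpy 1.24.3, as in the log below).
print([str(w.message) for w in caught])
```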

What did you expect to happen?

No warning should be printed.

Minimal Complete Verifiable Example

import numpy
import xarray

one_day = numpy.timedelta64(1, 'D')
nat = numpy.timedelta64('nat')

timedelta_values = (numpy.arange(5) * one_day).astype('timedelta64[ns]')
timedelta_values[2] = nat
timedelta_values[4] = nat

dataset = xarray.Dataset(data_vars={
    'timedeltas': xarray.DataArray(data=timedelta_values, dims=['x'])
})
dataset.to_netcdf('out.nc')

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

$ python3 safe_cast.py
/home/hea211/projects/emsarray/.conda/lib/python3.10/site-packages/xarray/coding/times.py:618: RuntimeWarning: invalid value encountered in cast
  int_num = np.asarray(num, dtype=np.int64)

$ ncdump out.nc
netcdf out {
dimensions:
        x = 5 ;
variables:
        double timedeltas(x) ;
                timedeltas:_FillValue = NaN ;
                timedeltas:units = "days" ;
data:

 timedeltas = 0, 1, _, 3, _ ;
}

Anything else we need to know?

I saw the numpy.can_cast function and tried to use that to solve the issue (see PR #7834), however this function did not do what I expected it to.
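
For context, numpy.can_cast answers a question about dtypes rather than about the values stored in an array, which is presumably why it could not help here. The lines below are only an illustration of that behaviour and are not taken from PR #7834.

```python
import numpy as np

# An array of whole numbers stored as float64.
whole = np.array([0.0, 1.0, 2.0])

# can_cast only looks at the dtype, never at the values in the array.
safe = np.can_cast(whole.dtype, np.int64)                      # default "safe" rule
unsafe = np.can_cast(whole.dtype, np.int64, casting="unsafe")  # any cast allowed

print(safe)    # False: float64 -> int64 is not a "safe" cast
print(unsafe)  # True: same answer for every float64 array, NaN or not
```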

A search for other ways to check whether an array of floating point values is representable as integers turned up "Numpy: Check if float array contains whole numbers" on Stack Overflow. There are a few solutions given in that question, although each has its drawbacks. The most complete solution appears to be is_integer_ufunc, which is a ufunc written in C. Unfortunately this is not installable via pip/conda, and is not included in numpy.
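
Short of a dedicated ufunc, one possible workaround is to keep the existing comparison and silence the floating point warning only around the cast, using numpy.errstate. This is a minimal sketch under that assumption: cast_to_int_if_safe_quiet is a hypothetical name, and this is not necessarily the fix xarray should adopt.

```python
import numpy as np


def cast_to_int_if_safe_quiet(num) -> np.ndarray:
    # Hypothetical variant of cast_to_int_if_safe, for illustration only.
    num = np.asarray(num)
    # Ignore "invalid value encountered in cast" for this one cast;
    # the equality check below still rejects NaN and out-of-range values.
    with np.errstate(invalid="ignore"):
        int_num = num.astype(np.int64)
    if (num == int_num).all():
        return int_num
    return num


whole = cast_to_int_if_safe_quiet(np.array([0.0, 1.0, 2.0]))
gappy = cast_to_int_if_safe_quiet(np.array([0.0, 1.0, np.nan]))
print(whole.dtype)  # int64: every value is a whole number
print(gappy.dtype)  # float64: NaN blocks the conversion
```

The trade-off is that the warning is hidden rather than avoided: the cast of NaN still happens, and its result is simply discarded by the equality check.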

Environment

In [2]: import xarray as xr
...: xr.show_versions()
/home/hea211/projects/emsarray/.conda/lib/python3.10/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
warnings.warn("Setuptools is replacing distutils.")

INSTALLED VERSIONS

commit: None
python: 3.10.0 (default, Mar 3 2022, 09:58:08) [GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 5.15.0-73-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_AU.UTF-8
LOCALE: ('en_AU', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.9.1

xarray: 2023.4.2
pandas: 2.0.1
numpy: 1.24.3
scipy: None
netCDF4: 1.6.3
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: 1.3.7
dask: 2023.4.1
distributed: 2023.4.1
matplotlib: 3.7.1
cartopy: 0.21.1
seaborn: None
numbagg: None
fsspec: 2023.5.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 65.6.3
pip: 22.3.1
conda: None
pytest: 7.3.1
mypy: 1.3.0
IPython: 8.12.0
sphinx: 4.3.2
