Description
What happened?
In recent versions of numpy, calling numpy.asarray(arr, dtype=numpy.int64) emits a RuntimeWarning if the input array contains numpy.nan values. This call is used in xarray.coding.times.cast_to_int_if_safe(num):
def cast_to_int_if_safe(num) -> np.ndarray:
    int_num = np.asarray(num, dtype=np.int64)
    if (num == int_num).all():
        num = int_num
    return num
The comparison still produces the correct True/False values regardless of the warning, so the function's return value is unaffected.
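The warning can be reproduced with NumPy alone, which helps confirm that it comes from the cast and not from the comparison. The following is a minimal sketch, assuming NumPy 1.24 or later (older versions cast silently); the helper name `cast_and_compare` is hypothetical and not part of xarray.

```python
import warnings

import numpy as np


def cast_and_compare(num):
    """Mimic the cast and comparison from cast_to_int_if_safe, recording warnings.

    Hypothetical helper, for illustration only.
    """
    num = np.asarray(num)
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        # Casting NaN to int64 is the step that is expected to warn.
        int_num = np.asarray(num, dtype=np.int64)
        # NaN never compares equal to anything, so this is False when NaN is present.
        all_equal = bool((num == int_num).all())
    return all_equal, [str(w.message) for w in caught]


all_equal, messages = cast_and_compare([0.0, 1.0, np.nan, 3.0, np.nan])
print(all_equal)  # False
print(messages)   # on NumPy >= 1.24 this should include the cast warning
```

For an input of whole numbers without NaN, the same helper returns `True` and an empty list of messages.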
What did you expect to happen?
No warning to be printed
Minimal Complete Verifiable Example
import numpy
import xarray
one_day = numpy.timedelta64(1, 'D')
nat = numpy.timedelta64('nat')
timedelta_values = (numpy.arange(5) * one_day).astype('timedelta64[ns]')
timedelta_values[2] = nat
timedelta_values[4] = nat
dataset = xarray.Dataset(data_vars={
    'timedeltas': xarray.DataArray(data=timedelta_values, dims=['x'])
})
dataset.to_netcdf('out.nc')
MCVE confirmation
- Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- Complete example — the example is self-contained, including all data and the text of any traceback.
- Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- New issue — a search of GitHub Issues suggests this is not a duplicate.
Relevant log output
$ python3 safe_cast.py
/home/hea211/projects/emsarray/.conda/lib/python3.10/site-packages/xarray/coding/times.py:618: RuntimeWarning: invalid value encountered in cast
  int_num = np.asarray(num, dtype=np.int64)
$ ncdump out.nc
netcdf out {
dimensions:
    x = 5 ;
variables:
    double timedeltas(x) ;
        timedeltas:_FillValue = NaN ;
        timedeltas:units = "days" ;
data:

 timedeltas = 0, 1, _, 3, _ ;
}
Anything else we need to know?
I saw the numpy.can_cast function and tried to use it to solve the issue (see PR #7834), but it did not do what I expected.
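As a possible explanation, offered as an assumption and not as a description of what PR #7834 tried: `numpy.can_cast` reasons about dtypes under a casting rule and does not inspect the values stored in an array. A short sketch of that behaviour:

```python
import numpy as np

# np.can_cast answers a question about dtypes, not about the values stored.
float_to_int_safe = np.can_cast(np.float64, np.int64)                      # default rule is "safe"
float_to_int_unsafe = np.can_cast(np.float64, np.int64, casting="unsafe")  # any cast is permitted
int_widening = np.can_cast(np.int32, np.int64)                             # widening loses nothing

print(float_to_int_safe)    # False
print(float_to_int_unsafe)  # True
print(int_widening)         # True

whole = np.array([0.0, 1.0, 3.0])
with_nan = np.array([0.0, np.nan, 3.0])
# Both arrays share the float64 dtype, so the dtype-level answer is the same
# whether or not the values happen to be whole numbers.
same_answer = np.can_cast(whole.dtype, np.int64) == np.can_cast(with_nan.dtype, np.int64)
print(same_answer)  # True
```

Because the answer depends only on the dtype pair, it cannot distinguish an array of whole numbers from one containing NaN.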
Searching for other ways to check whether an array of floating point values is representable as integers turned up "Numpy: Check if float array contains whole numbers" on Stack Overflow. Several solutions are given in that question, although each has its drawbacks. The most complete solution appears to be is_integer_ufunc, a ufunc written in C. Unfortunately it is not installable via pip or conda, and it is not included in numpy.
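One pure-NumPy option is to test the values before casting, so that the invalid cast is never attempted. The sketch below is an illustration under stated assumptions and not the fix adopted by xarray: the names `is_whole_and_in_int64_range` and `cast_to_int_if_whole` are hypothetical, only floating point input is handled, and the check allocates temporary arrays, which is one of the drawbacks mentioned above.

```python
import numpy as np


def is_whole_and_in_int64_range(values):
    """Return True if every value is finite, whole, and fits in int64.

    Hypothetical helper, for illustration only.
    """
    values = np.asarray(values, dtype=np.float64)
    if not np.isfinite(values).all():
        return False  # NaN or infinity cannot be represented as an integer
    if not (values == np.trunc(values)).all():
        return False  # at least one value has a fractional part
    # float64 represents -2**63 exactly but not 2**63 - 1,
    # hence the strict upper bound.
    return bool(((values >= -2.0**63) & (values < 2.0**63)).all())


def cast_to_int_if_whole(num):
    """Cast a float array to int64 only when no information would be lost.

    Hypothetical variant of cast_to_int_if_safe; non-float input is
    returned unchanged.
    """
    num = np.asarray(num)
    if num.dtype.kind == "f" and is_whole_and_in_int64_range(num):
        return num.astype(np.int64)
    return num


print(cast_to_int_if_whole(np.array([0.0, 1.0, 2.0])).dtype)     # int64
print(cast_to_int_if_whole(np.array([0.0, np.nan, 3.0])).dtype)  # float64
```

Since the cast only runs after the finiteness and range checks pass, arrays containing NaN are returned as floats without triggering the warning.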
Environment
In [2]: import xarray as xr
   ...: xr.show_versions()
/home/hea211/projects/emsarray/.conda/lib/python3.10/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
  warnings.warn("Setuptools is replacing distutils.")
INSTALLED VERSIONS
commit: None
python: 3.10.0 (default, Mar 3 2022, 09:58:08) [GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 5.15.0-73-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_AU.UTF-8
LOCALE: ('en_AU', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.9.1
xarray: 2023.4.2
pandas: 2.0.1
numpy: 1.24.3
scipy: None
netCDF4: 1.6.3
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: 1.3.7
dask: 2023.4.1
distributed: 2023.4.1
matplotlib: 3.7.1
cartopy: 0.21.1
seaborn: None
numbagg: None
fsspec: 2023.5.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 65.6.3
pip: 22.3.1
conda: None
pytest: 7.3.1
mypy: 1.3.0
IPython: 8.12.0
sphinx: 4.3.2