Description
MCVE Code Sample
data here: https://www.dropbox.com/sh/8eist9mmlf41mpc/AAB8yp6ERz-b4VYozL8tsj-ma?dl=0
import xarray as xr
import numpy as np
print(xr.__version__)  # 0.14.1
ds1 = xr.open_dataset('tas_Amon_NorESM1-ME_rcp26_r1i1p1_200601-206012.nc')
ds2 = xr.open_dataset('tas_Amon_NorESM1-ME_rcp26_r1i1p1_206101-210112.nc')
print(np.allclose(ds1.lat, ds2.lat), np.allclose(ds1.lat_bnds, ds2.lat_bnds))  # True, True
ds3 = xr.concat((ds1,ds2), dim='time')
print(ds3.lat.shape == ds1.lat.shape) # False
Expected Output
0.14.1
True True
True
since ds3.lat should be identical to ds1.lat
Problem Description
I've encountered a particular NetCDF dataset that is not handled correctly by xarray's concat operation. It's a climate dataset with 96 latitude points that has been split into two time segments. After concatenating along dim='time', there are suddenly 142 latitude points, even though the latitude arrays are, as far as I can tell, completely identical.
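A plausible cause (this is an assumption, not verified against the Dropbox files) is that the two files' lat coordinates agree within np.allclose tolerance but differ in the last float bits, since concat aligns non-concatenated dimensions on exact coordinate values. A minimal synthetic sketch of that behavior:

```python
import numpy as np
import xarray as xr

# Synthetic stand-ins for the two time segments: same 96-point latitude
# grid, but the second copy is perturbed by less than allclose tolerance.
lat = np.linspace(-90, 90, 96)
lat_perturbed = lat + 1e-13  # tiny float difference; allclose still True

seg1 = xr.Dataset({"tas": (("time", "lat"), np.zeros((2, 96)))},
                  coords={"time": [0, 1], "lat": lat})
seg2 = xr.Dataset({"tas": (("time", "lat"), np.zeros((2, 96)))},
                  coords={"time": [2, 3], "lat": lat_perturbed})

print(np.allclose(seg1.lat, seg2.lat))            # True
print((seg1.lat.values == seg2.lat.values).all()) # False at full precision

# concat performs an outer join on lat, so every point that does not match
# exactly appears twice, and the result grows beyond 96 latitude points.
combined = xr.concat((seg1, seg2), dim="time")
print(combined.lat.size)
```

In the real dataset, 142 points (rather than 2 x 96) would mean some latitude values match bit-for-bit and others do not. Comparing `ds1.lat.values == ds2.lat.values` directly would confirm or rule this out.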
As a workaround, I've tried to reindex the result (ds3) as follows:
ds4 = ds3.reindex_like(ds1.drop_dims('time'))
but that yields an incomplete field after the year 2061. This can be seen by issuing:
ds4.tas.isel(time=-1).plot()
with white areas indicating missing data. There is no missing data in the source files.
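If the root cause is indeed a tiny coordinate mismatch, two approaches may avoid both the inflated latitude axis and the missing data that reindex_like produces. Sketched on the same synthetic data as above (not the actual tas_* files), using only documented xarray API (assign_coords and concat's join parameter):

```python
import numpy as np
import xarray as xr

# Synthetic two-segment dataset with a sub-tolerance lat mismatch.
lat = np.linspace(-90, 90, 96)
seg1 = xr.Dataset({"tas": (("time", "lat"), np.ones((2, 96)))},
                  coords={"time": [0, 1], "lat": lat})
seg2 = xr.Dataset({"tas": (("time", "lat"), np.ones((2, 96)))},
                  coords={"time": [2, 3], "lat": lat + 1e-13})

# Option 1: copy the first segment's coordinate onto the second before
# concatenating, so the lat indexes are bit-identical.
fixed = xr.concat((seg1, seg2.assign_coords(lat=seg1.lat)), dim="time")
print(fixed.lat.size)  # 96

# Option 2: tell concat to take non-concatenated indexes from the first
# object instead of outer-joining them.
fixed2 = xr.concat((seg1, seg2), dim="time", join="override")
print(fixed2.lat.size)  # 96
```

Either way the result keeps 96 latitude points with no NaN fill, since no alignment-by-value takes place.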
Output of xr.show_versions()
xarray: 0.14.1
pandas: 0.25.1
numpy: 1.16.4
scipy: 1.3.1
netCDF4: 1.4.2
pydap: None
h5netcdf: None
h5py: 2.9.0
Nio: None
zarr: None
cftime: 1.0.0b1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 2.3.0
distributed: 2.3.2
matplotlib: 3.0.2
cartopy: 0.17.0
seaborn: 0.9.0
numbagg: None
setuptools: 41.0.1
pip: 19.2.2
conda: 4.8.1
pytest: 5.0.1
IPython: 7.7.0
sphinx: 2.1.2