to_netcdf / open_dataset is not idempotent #4512

Open
MVivien opened this issue Oct 15, 2020 · 1 comment

MVivien commented Oct 15, 2020

What happened:
I created a Dataset from a DataArray whose name is the same as its dimension name and which has no coordinates. After saving the Dataset to netCDF and reopening the file with open_dataset, the resulting Dataset has no data variables: the original variable has become a coordinate.

What you expected to happen:
I would expect the to_netcdf / open_dataset round trip to be idempotent, that is, to return a Dataset identical to the one I saved as netCDF.

Minimal Complete Verifiable Example:

import xarray as xr

da = xr.DataArray(
    [1, 2, 3, 4],
    dims=['lat'],
    name='lat'
)
ds = da.to_dataset()

ds.to_netcdf('bug.nc')
ds2 = xr.open_dataset('bug.nc')

print(ds)
print(ds2)

Output

<xarray.Dataset>
Dimensions:  (lat: 4)
Dimensions without coordinates: lat
Data variables:
    lat      (lat) int64 1 2 3 4

<xarray.Dataset>
Dimensions:  (lat: 4)
Coordinates:
  * lat      (lat) int64 1 2 3 4
Data variables:
    *empty*

Anything else we need to know?:

Environment:

Output of xr.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.12 |Anaconda, Inc.| (default, Sep 8 2020, 17:50:39)
[GCC Clang 10.0.0 ]
python-bits: 64
OS: Darwin
OS-release: 19.0.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
libhdf5: 1.10.6
libnetcdf: 4.7.4

xarray: 0.16.1
pandas: 1.1.3
numpy: 1.19.2
scipy: 1.5.2
netCDF4: 1.5.4
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.2.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: 0.9.8.4
iris: None
bottleneck: None
dask: 2.30.0
distributed: None
matplotlib: 3.1.3
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 50.3.0.post20201006
pip: 20.2.3
conda: None
pytest: 6.1.0
IPython: 5.8.0
sphinx: None

@dcherian
Contributor

This is the same bug as in #4108 (comment)

<xarray.Dataset>
Dimensions:  (lat: 4)
Dimensions without coordinates: lat
Data variables:
    lat      (lat) int64 1 2 3 4

This isn't xarray's data model IIUC. Variables with the same name as dimensions are treated as coordinate variables (or indexed dimensions).
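
To make the data-model point concrete, here is a minimal in-memory sketch. It assumes a reasonably recent xarray; the name `latitude` is a hypothetical replacement chosen only for illustration and is not something proposed in this thread. The first part shows that the Dataset constructor itself files a variable named after its own dimension under coordinates. The second part shows one possible workaround: giving the array a name that differs from its dimension before calling to_dataset.

```python
import xarray as xr

# A variable named after its own dimension, passed to the Dataset
# constructor, is stored as a coordinate rather than a data variable.
ds_from_dict = xr.Dataset({"lat": ("lat", [1, 2, 3, 4])})
print(list(ds_from_dict.coords))
print(list(ds_from_dict.data_vars))

# Possible workaround (an assumption, not a fix from the maintainers):
# rename the array so its name no longer collides with the dimension.
# "latitude" is a hypothetical name used only for this example.
da = xr.DataArray([1, 2, 3, 4], dims=["lat"], name="lat")
ds_renamed = da.rename("latitude").to_dataset()
print(list(ds_renamed.data_vars))
print(list(ds_renamed.coords))
```

With the names kept distinct, the variable stays a data variable on a dimension without coordinates, so writing it with to_netcdf and reading it back with open_dataset should not reclassify it, although that round trip is not exercised in this sketch.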
