
copy / deepcopy not deepcopying coords? #1463

Closed
robintibor opened this issue Jun 22, 2017 · 8 comments
Comments

@robintibor

I don't know if this is intentional. I thought that arr.copy(deep=True) or deepcopy(arr) would give me completely independent copies of a DataArray, but this does not seem to be the case?

>>> import xarray as xr
>>> xarr1 = xr.DataArray([1,2], coords=dict(x=[0,1]), dims=('x',))
>>> xarr1.x.data[0]
0
>>> xarr2 = xarr1.copy(deep=True) #xarr2 = deepcopy(xarr1) -> leads to same result
>>> xarr2.x.data[0] = -1
>>> xarr1.x.data[0]
-1

How can I create completely independent copies of a DataArray? I wrote a function for this, but I don't know whether it always does what I expect, or whether there is a more elegant way.

import numpy as np
from copy import deepcopy

def deepcopy_xarr(xarr):
    """
    Deepcopy for xarray that makes sure coords and attrs
    are properly deepcopied.
    With the normal copy method from xarray, when I mutated
    xarr.coords[coord].data it would also mutate in the copy
    and vice versa.

    Parameters
    ----------
    xarr: DataArray

    Returns
    -------
    xcopy: DataArray
        Deep copy of xarr
    """
    xcopy = xarr.copy(deep=True)

    # Give every coordinate its own freshly allocated array.
    for dim in xcopy.coords:
        xcopy.coords[dim].data = np.copy(xcopy.coords[dim].data)
    # Copy the attrs dict and each of its values.
    xcopy.attrs = deepcopy(xcopy.attrs)
    for attr in xcopy.attrs:
        xcopy.attrs[attr] = deepcopy(xcopy.attrs[attr])
    return xcopy
@shoyer
Member

shoyer commented Jun 22, 2017

This seems like a bug.

I suspect the problem is in Variable.copy on these lines (which should probably just be removed). Coordinates store their data in pandas.Index objects, which are supposed to be immutable. But apparently that's not necessarily the case.
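As a rough illustration of the failure mode described above, here is a toy sketch in plain Python. ToyVariable and its copy_index flag are hypothetical stand-ins invented for this example, not xarray or pandas code. The sketch only shows that a "deep" copy which reuses an index it assumes to be immutable still aliases the original.

```python
import copy


class ToyVariable:
    """Hypothetical stand-in for a variable holding values plus an index."""

    def __init__(self, values, index):
        self.values = values
        self.index = index

    def copy(self, deep=True, copy_index=False):
        values = copy.deepcopy(self.values) if deep else self.values
        # Assumed shortcut for illustration: the index is reused unless
        # copy_index is set, because it is treated as immutable.
        if deep and copy_index:
            index = copy.deepcopy(self.index)
        else:
            index = self.index
        return ToyVariable(values, index)


# With the shortcut, mutating the copy's index leaks into the original.
original = ToyVariable(values=[1, 2], index=[0, 1])
aliased = original.copy(deep=True)
aliased.index[0] = -1
print(original.index[0])  # -1

# Copying the index as well makes the two objects independent.
original = ToyVariable(values=[1, 2], index=[0, 1])
independent = original.copy(deep=True, copy_index=True)
independent.index[0] = -1
print(original.index[0])  # 0
```

The shortcut is harmless only as long as nothing can mutate the shared index in place.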

@fujiisoup
Member

We do not allow assigning values to an IndexVariable, since pandas.Index is immutable (assigning a value raises a TypeError),
but it can actually be done through the .data attribute (this line).
(Our IndexVariable is not immutable...)

I think we should also copy the IndexVariable rather than keep the original reference.
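The gap between "item assignment is blocked" and "the object is really immutable" can be sketched in plain Python. FrozenIndex below is a hypothetical class made up for this illustration, and it does not reproduce how pandas.Index or IndexVariable are implemented. It shows a wrapper that rejects direct assignment, still leaks its buffer through a data property, and offers a copy method that hands out a fresh buffer.

```python
class FrozenIndex:
    """Hypothetical wrapper: blocks item assignment but exposes its buffer."""

    def __init__(self, items):
        self._items = list(items)  # private buffer, copied on construction

    def __getitem__(self, position):
        return self._items[position]

    def __setitem__(self, position, value):
        raise TypeError("FrozenIndex does not support item assignment")

    @property
    def data(self):
        # Returns the live buffer, not a copy: this is the back door.
        return self._items

    def copy(self):
        # The constructor copies the list, so the new object is independent.
        return FrozenIndex(self._items)


idx = FrozenIndex([0, 1])

try:
    idx[0] = -1
except TypeError as err:
    print("blocked:", err)

idx.data[0] = -1  # mutation through the exposed buffer succeeds
print(idx[0])  # -1

fresh = idx.copy()
fresh.data[0] = 99
print(idx[0])  # -1, the original is unaffected by changes to the copy
```

Copying the wrapper on every deep copy, as suggested above, removes the aliasing even though the back door stays open.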

@pletchm
Contributor

pletchm commented Apr 12, 2019

I'd like to take a shot at fixing this bug unless someone else is already working on it. Would that be alright?

@dcherian
Contributor

Go for it, @pletchm!

Feel free to open a PR or ask questions if you need help.

@max-sixty
Collaborator

Great, @pletchm! This is an example of a recent similar issue: https://github.com/pydata/xarray/pull/2839/files

@max-sixty
Collaborator

This also doesn't work for ._replace: https://github.com/pydata/xarray/blob/master/xarray/core/dataarray.py#L296

So my comment here isn't really correct: #3086 (comment)

@pletchm should I have a go at a PR? Happy to take this and you can take another one; I have lots of time atm

@shoyer
Member

shoyer commented Jul 9, 2019

I think this was fixed by #2936. Certainly I can't reproduce the example in the first comment here any more.

@max-sixty
Collaborator

max-sixty commented Jul 9, 2019

This is fixed! It's not allowing attrs to be passed into _replace. I'll open a new issue. I think that's OK, since attrs are on the variable rather than the DataArray.

6 participants