serialized ndarray not writeable #1368

Closed
caseyjlaw opened this issue Aug 30, 2017 · 4 comments

@caseyjlaw
Contributor

caseyjlaw commented Aug 30, 2017

I'm using dask and numba to accelerate Python on a small cluster. A common dask/distributed pattern for me is to submit a series of functions as a pipeline, like:

data = cl.submit(read_data)
new_data = cl.submit(correct, data)

However, some of my functions transform the data in place using numba. That is a problem, since it seems that an ndarray is not writable after serialization. So (continuing from above):

> data = data.result()
> data.flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : False
  ALIGNED : True
  UPDATEIFCOPY : False

In contrast, if I run the function locally, I get:

  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

My workaround is to use np.require to wrap the data in my numba function, but I thought this might trip others up, too. Is there a reason a serialized ndarray is made nonwriteable?
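A minimal sketch of that workaround, for readers who hit the same problem. The helper name `ensure_writeable` is hypothetical, and the read-only input is simulated with `np.frombuffer` over immutable bytes rather than taken from a real dask worker. `np.require` with the `"W"` requirement returns a writeable copy when the input is read-only.

```python
import numpy as np


def ensure_writeable(arr):
    """Return a writeable array: a copy if `arr` is read-only (hypothetical helper)."""
    return np.require(arr, requirements=["W"])


# Simulate an array that came back read-only (assumption: built on immutable bytes).
payload = np.arange(4, dtype=np.float64).tobytes()
received = np.frombuffer(payload, dtype=np.float64)
print(received.flags.writeable)  # False

data = ensure_writeable(received)
data[0] = 42.0  # in-place modification now succeeds on the copy
print(data.flags.writeable)  # True
```

Note that the write lands in a copy, so the read-only original is left untouched.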

@mrocklin
Member

More broadly, Dask tends to assume that your functions don't mutate state. For example, it reserves the right to run a function twice if something bad happens.

So, putting on my "let's be safe" hat, I'll suggest that you avoid mutating data that dask knows about.
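One way to follow that advice is to have the task work on a private copy and return it, as in the sketch below. The name `correct` comes from the pipeline above, but its body here is purely hypothetical (the issue does not show what it does), and the read-only input is simulated by clearing the flag by hand.

```python
import numpy as np


def correct(data):
    # Hypothetical body: subtract the mean without touching the input.
    out = np.array(data, copy=True)  # fresh, writeable buffer
    out -= out.mean()                # mutate only the private copy
    return out


raw = np.array([1.0, 2.0, 3.0])
raw.flags.writeable = False  # mimic an array that arrived read-only
fixed = correct(raw)
print(fixed.flags.writeable)  # True
```

Because the input is never modified, running the function a second time on the same input gives the same result.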

@mrocklin
Member

More specifically, I don't remember exactly why the memory is non-writeable. I suspect that we're basing the array on some bytestring pulled off of a tornado socket, and perhaps that bytestring is immutable?
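The suspected mechanism can be reproduced with NumPy alone. This sketch does not show how distributed actually deserializes arrays; it only illustrates that an array built over an immutable `bytes` object is read-only, while one built over a mutable `bytearray` is not.

```python
import numpy as np

payload = np.arange(3, dtype=np.int64).tobytes()  # immutable bytes object

view = np.frombuffer(payload, dtype=np.int64)
print(view.flags.writeable)  # False: the underlying buffer is immutable

try:
    view[0] = 10
except ValueError as exc:
    print("write failed:", exc)

mutable_view = np.frombuffer(bytearray(payload), dtype=np.int64)
print(mutable_view.flags.writeable)  # True: bytearray is a mutable buffer
mutable_view[0] = 10
```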

@caseyjlaw
Contributor Author

Ok, I thought that might be the case. Hopefully this helps others understand, should the issue arise for them. I'll close this.

@jakirkham
Member

FWIW this inconsistency between schedulers is being discussed in issue #1978. If you have any thoughts on how you would like to see this inconsistency addressed, please feel free to share in that issue.
