Closed
Description
As noted in pydata/xarray#1769, dask currently does not expose any way to determine the computed type of a dask array containing MaskedArray or sparse blocks.
This is untenable for building complex codebases using dask: we need to know the type of subarrays so that errors can be raised at graph-building time instead of compute time. For masked arrays, this is merely inconvenient, but mixing up dense and sparse arrays is a very serious concern, because it could entail loading very large arrays into memory.
I would suggest a simple hierarchy:
- BaseArray: abstract base class for all dask array types
- Array(BaseArray): base numpy.ndarray elements
- MaskedArray(BaseArray): numpy.ma.MaskedArray elements
- SparseArray(BaseArray): sparse array elements.
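To make the proposal concrete, here is a rough sketch of how such a hierarchy could surface type errors at graph-building time. None of this is existing dask API: the class bodies, the `block_kind` attribute, and the `check_same_block_type` helper are hypothetical names made up for illustration, and the classes hold no data or task graph.

```python
from abc import ABC, abstractmethod


class BaseArray(ABC):
    """Hypothetical abstract base class for all dask array types."""

    def __init__(self, name):
        self.name = name

    @property
    @abstractmethod
    def block_kind(self):
        """Name of the in-memory type each block computes to."""


class Array(BaseArray):
    block_kind = "numpy.ndarray"


class MaskedArray(BaseArray):
    block_kind = "numpy.ma.MaskedArray"


class SparseArray(BaseArray):
    # Assumption: blocks are some sparse array type; no specific library implied.
    block_kind = "sparse array"


def check_same_block_type(*arrays):
    """Hypothetical graph-building check: reject mixed block types.

    Raises TypeError before anything is computed, so a dense/sparse mix-up
    is reported without loading any blocks into memory.
    """
    kinds = {type(a) for a in arrays}
    if len(kinds) > 1:
        names = sorted(k.__name__ for k in kinds)
        raise TypeError(f"cannot combine dask arrays of types: {', '.join(names)}")
    return kinds.pop() if kinds else None
```

With classes like these, downstream code such as xarray could also use a plain `isinstance(x, SparseArray)` check instead of computing a block to inspect its type.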