implement normalize_token #3378

dcherian · 2019-10-07T17:07:29Z

dcherian · 2019-10-11T15:55:23Z

How should this be implemented?

crusaderky · 2019-10-12T11:27:52Z

https://docs.dask.org/en/latest/custom-collections.html#implementing-deterministic-hashing

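from dask.base import normalize_token
from xarray import DataArray, Dataset, IndexVariable, Variable

@normalize_token.register(Dataset)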
def tokenize_dataset(ds):
    return Dataset, ds._variables, ds._coord_names, ds._attrs

@normalize_token.register(DataArray)
def tokenize_dataarray(da):
    return DataArray, da._variable, da._coords, da._name

# Note: the @singledispatch for IndexVariable must be defined before the one for Variable
@normalize_token.register(IndexVariable)
def tokenize_indexvariable(v):
    # Don't waste time converting pd.Index to np.ndarray
    return IndexVariable, v._dims, v._data.array, v._attrs

@normalize_token.register(Variable)
def tokenize_variable(v):
    # Note: it's v.data, not v._data, in order to cope with the 
    # wrappers around NetCDF and the like
    return Variable, v._dims, v.data, v._attrs

You'll need to write a dummy normalize_token for when dask is not installed.
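One possible shape for that dummy, as a minimal sketch: try the real import and fall back to a stand-in whose register() just hands the function back unchanged. The class name _DummyNormalizeToken and the Toy example class are hypothetical, used only for illustration; this is not necessarily what xarray ended up doing.

```python
try:
    from dask.base import normalize_token
except ImportError:

    class _DummyNormalizeToken:
        """Hypothetical stand-in: accepts registrations, never dispatches."""

        def register(self, cls, func=None):
            # Support both register(cls, func) and @register(cls).
            if func is None:
                return lambda f: f
            return func

    normalize_token = _DummyNormalizeToken()


class Toy:
    """Illustrative class, standing in for Dataset / DataArray / Variable."""

    def __init__(self, value):
        self.value = value


@normalize_token.register(Toy)
def tokenize_toy(obj):
    return Toy, obj.value
```

With this in place the @normalize_token.register(...) decorators can stay at module level whether or not dask is installed; without dask they are simply no-ops.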

Unit tests:

- running tokenize() twice on the same object returns the same result
- changing the content of a data_var (or the variable, for DataArray) changes the output
- changing the content of a coord changes the output
- changing attrs, name, or dimension names changes the output
- whether a variable is a data_var or a coord changes the output
- dask arrays aren't computed
- non-numpy, non-dask NEP18 data is not converted to numpy
- works with xarray's fancy wrappers around NetCDF and the like
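The determinism and "changing X changes the output" checks above share one pattern, which could be factored into a helper along these lines. This is only a sketch: check_token_properties, make_toy and toy_tokenize are hypothetical names, and the dict-based object and SHA-1 tokenizer are stand-ins so the example runs without dask or xarray.

```python
import hashlib


def check_token_properties(tokenize, make_obj, mutations):
    """Assert that tokenize() is deterministic and sensitive to each mutation.

    tokenize:  callable returning a comparable token for an object
    make_obj:  zero-argument factory returning a fresh baseline object
    mutations: mapping of label -> function that modifies an object in place
    """
    baseline = make_obj()
    token = tokenize(baseline)
    # Tokenizing the same object twice must give the same result.
    assert tokenize(baseline) == token
    for label, mutate in mutations.items():
        changed = make_obj()
        mutate(changed)
        assert tokenize(changed) != token, f"{label} did not change the token"


# Stand-in object and tokenizer (assumptions, not xarray or dask APIs).
def make_toy():
    return {"dims": ("x",), "data": [1, 2, 3], "attrs": {"units": "m"}, "name": "a"}


def toy_tokenize(obj):
    return hashlib.sha1(repr(sorted(obj.items())).encode()).hexdigest()


check_token_properties(
    toy_tokenize,
    make_toy,
    {
        "data": lambda o: o["data"].__setitem__(0, 99),
        "attrs": lambda o: o["attrs"].update(units="km"),
        "name": lambda o: o.update(name="b"),
        "dims": lambda o: o.update(dims=("y",)),
    },
)
```

With dask and xarray installed, the same helper could presumably be called with dask.base.tokenize and a factory that builds a small Dataset or DataArray. The remaining items (dask arrays not computed, NEP18 data not converted, NetCDF wrappers) need dedicated tests.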

dcherian mentioned this issue Oct 7, 2019

map_blocks #3276

Merged

4 tasks

dcherian mentioned this issue Oct 25, 2019

__dask_tokenize__ #3446

Merged

12 tasks

crusaderky closed this as completed in #3446 Oct 31, 2019
