MultiZarrToZarr combination associativity; issues in map/reduce workflow

This may end up being a super obvious mistake for someone that's a bit more familiar with the expectations of the `kerchunk` API, so forgive me if this issue is being lodged out of ignorance!

The workflow I'm dealing with combines kerchunk references via map/reduce. The reduce step assumes that combination is associative: different numbers of workers for the same job mean the inputs are grouped differently, since each worker's kerchunk refs are combined via `MultiZarrToZarr.translate()` and those partial results are then combined again on the driver node, hopefully yielding a single ref for all underlying datasets.

# Assumption:
Combination followed by translation with `MultiZarrToZarr` is associative: refs can be combined in any grouping, and the partial results combined again at the end of the process, without changing the final result.
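
To make the property concrete, here is a minimal, library-independent sketch of the check I have in mind. `partitioned_reduce` and `merge` are hypothetical names introduced only for illustration; `merge` is a plain dict union standing in for the real combine step, which in the workflow described here would wrap `MultiZarrToZarr(...).translate()`.

```python
from functools import reduce
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def partitioned_reduce(
    items: Sequence[T], combine: Callable[[T, T], T], n_workers: int
) -> T:
    """Reduce `items` in contiguous partitions, then reduce the partial results."""
    size = -(-len(items) // n_workers)  # ceiling division
    partitions = [items[i:i + size] for i in range(0, len(items), size)]
    partials = [reduce(combine, part) for part in partitions]  # per-worker step
    return reduce(combine, partials)  # driver step


def merge(a: dict, b: dict) -> dict:
    """Hypothetical stand-in for the combine step: a plain dict union."""
    return {**a, **b}


# Made-up inputs with disjoint keys, one dict per "file".
items = [{f"var/{i}": i} for i in range(10)]
results = [partitioned_reduce(items, merge, n) for n in (1, 2, 4)]

# For an associative combine, the worker count must not change the result.
assert all(r == results[0] for r in results)
```

Swapping `merge` for a non-associative operation (subtraction on integers, for example) makes the results depend on `n_workers`, which is the same symptom described below.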

# Reality: 
I'm seeing radically different results depending on how many workers are used, suggesting that associativity of this combination is not a safe assumption!

This is roughly what that workflow looks like.

Take a list of NetCDF files (with the same dimensions) and translate each one:
```python
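from kerchunk.netCDF3 import NetCDF3ToZarr

# Build the references for a single NetCDF3 file at `url`
chunks = NetCDF3ToZarr(
    url,
    inline_threshold=inline_threshold,
    storage_options=storage_options,
    **(kerchunk_open_kwargs or {}),
)
refs = [chunks.translate()]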
```

These dictionaries are distributed across workers, each of which builds a `MultiZarrToZarr` instance (map):
```python
from kerchunk.combine import MultiZarrToZarr

# (list[dict]) -> MultiZarrToZarr
MultiZarrToZarr(refs)
```

Each worker's `MultiZarrToZarr` is then translated and merged via another `MultiZarrToZarr` (reduce):
```python
# (Sequence[MultiZarrToZarr]) -> MultiZarrToZarr
refs = [a.translate() for a in multizarrtozarr]
accumulator = MultiZarrToZarr(refs)
```

Finally, the results are written out as a single ref:
```python
# MultiZarrToZarr -> dict
accumulator.translate()
```

At this point, the results differ. Here are some statistics I've pulled from the resulting ref files; note especially the different non-NaN value counts:
```
Stats for analysis_error (single worker):
Mean: 0.3497554361820221
Median: 0.3499999940395355
Standard Deviation: 0.004802505951374769
Non-NaN Count: 137376947

Stats for analysis_error (4 workers):
Mean: 0.3497870862483978
Median: 0.3499999940395355
Standard Deviation: 0.004866031929850578
Non-NaN Count: 109904479
```
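
One way to narrow this down might be to diff the two reference sets directly rather than the decoded values. The sketch below assumes the output of `translate()` is a dict with a top-level `"refs"` mapping (falling back to the dict itself otherwise); `diff_refs` is a hypothetical helper, and the two small dicts are made-up examples, not my actual data.

```python
def diff_refs(a: dict, b: dict) -> dict:
    """Compare two reference dicts and report which keys differ."""
    # Assumption: references live under a top-level "refs" key.
    refs_a = a.get("refs", a)
    refs_b = b.get("refs", b)
    keys_a, keys_b = set(refs_a), set(refs_b)
    return {
        "only_in_a": sorted(keys_a - keys_b),
        "only_in_b": sorted(keys_b - keys_a),
        "changed": sorted(k for k in keys_a & keys_b if refs_a[k] != refs_b[k]),
    }


# Made-up example: the second result is missing one chunk reference.
single_worker = {"version": 1, "refs": {
    "analysis_error/0.0.0": ["a.nc", 0, 100],
    "analysis_error/1.0.0": ["b.nc", 0, 100],
}}
four_workers = {"version": 1, "refs": {
    "analysis_error/0.0.0": ["a.nc", 0, 100],
}}

report = diff_refs(single_worker, four_workers)
print(report["only_in_a"])
```

If chunk keys turn out to be missing from the multi-worker result, that could be consistent with the lower non-NaN count, since chunks without a reference would presumably read back as the fill value. That is a guess on my part, not something I have confirmed.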

MultiZarrToZarr combination associativity; issues in map/reduce workflow #416
