All-or-nothing generation of multiple Groups/Arrays/.. with zarr: Possible Approaches? #3094
FabricioArendTorres
started this conversation in
General
Icechunk, which builds on top of Zarr, provides exactly this.
This might not be an option, depending on what file / object storage system you're using. Object stores like S3 don't provide atomic multi-object updates, so Zarr alone isn't enough. Consolidated metadata can help with a subset of use cases where you only ever append new data (since the update to the arrays can be done ahead of time and the update to the consolidated metadata file is atomic). But Icechunk is probably the way to go.
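The append-only idea can be illustrated without any Zarr API at all. The following is a minimal sketch, assuming a local directory and a hypothetical `manifest.json` that plays the role of the consolidated metadata file: new data is written first, and a single atomic file replacement makes it visible. The function names are made up for illustration, and `os.replace` is only atomic on a local filesystem, not on an object store.

```python
import json
import os
import tempfile
from pathlib import Path

MANIFEST = "manifest.json"  # hypothetical stand-in for consolidated metadata


def publish_manifest(root: Path, manifest: dict) -> None:
    """Write the manifest to a temp file, then atomically swap it into place."""
    fd, tmp_name = tempfile.mkstemp(dir=root, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(manifest, f)
        f.flush()
        os.fsync(f.fileno())  # make sure the bytes are on disk before the swap
    os.replace(tmp_name, root / MANIFEST)  # readers see old or new, never half


def visible_chunks(root: Path) -> list[str]:
    """Return only the chunks that the current manifest refers to."""
    path = root / MANIFEST
    return json.loads(path.read_text())["chunks"] if path.exists() else []


def append_chunk(root: Path, name: str, payload: bytes) -> None:
    """Step 1: write the data. Step 2: publish it with one atomic update."""
    chunks = visible_chunks(root)
    (root / name).write_bytes(payload)  # not yet referenced by the manifest
    publish_manifest(root, {"chunks": chunks + [name]})
```

A crash between the two steps leaves an orphaned data file, but readers that go through the manifest never see it. This only covers appends; overwriting existing data in place would still be visible before the manifest changes.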
-
Hi everyone,
Context
I'm working on a data format built on top of Zarr, using Zarr for array and metadata storage.
Part of this involves the creation of a zarr hierarchy, writing arrays, metadata etc.
I would like to bundle them up into transactions, to ensure that I never arrive at an inconsistent state on failure (e.g. system crash).
Question
I was wondering how to approach this in Zarr.
Options I considered:
Build node somewhere else, then move
At a quick glance, the V2 copy implementations do not appear to be atomic,
so building the group node in a temporary group first and then moving it does not resolve the issue.
I assume this is also highly dependent on the store.
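For reference, here is a minimal sketch of the build-then-move idea for the one case where I believe it can work: a group stored as a plain directory on a local filesystem, where `os.rename` within the same filesystem is atomic on POSIX. The helper `create_group_atomically` and the `build` callback are hypothetical names; in practice the callback would write the Zarr hierarchy into the staging directory.

```python
import os
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def create_group_atomically(
    root: Path, name: str, build: Callable[[Path], None]
) -> Path:
    """Build a group in a hidden staging directory, then rename it into place."""
    final = root / name
    # Staging lives under the same root so the rename stays on one filesystem.
    staging = Path(tempfile.mkdtemp(prefix=f".{name}.staging-", dir=root))
    try:
        build(staging)             # may write many files and may fail halfway
        os.rename(staging, final)  # single step that makes the group visible
    except BaseException:
        shutil.rmtree(staging, ignore_errors=True)  # discard the partial build
        raise
    return final
```

A hard crash (as opposed to an exception) would leave the staging directory behind, so leftover `.*.staging-*` directories would need to be cleaned up on the next start. None of this carries over to object stores, where a "move" is a copy of many objects.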
Create Locks in zarr store
Another approach would be the creation of lock files.
My first idea would be to create temporary groups as lock files.
Is there a nicer approach to store-agnostic locking?
I would like to avoid messing with the lower-level zarr metadata.
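To make the lock-file idea concrete, this is a minimal sketch for a local directory store only, using exclusive file creation (`O_CREAT | O_EXCL`) so that exactly one writer can create the lock. The lock name `.write.lock`, the `LockHeld` exception, and the `exclusive_lock` helper are all hypothetical, and this is not store-agnostic: other stores would need their own create-if-absent primitive.

```python
import os
from contextlib import contextmanager
from pathlib import Path


class LockHeld(RuntimeError):
    """Raised when another writer already holds the lock."""


@contextmanager
def exclusive_lock(root: Path, name: str = ".write.lock"):
    """Hold an exclusive lock file inside `root` for the duration of the block."""
    lock_path = root / name
    try:
        # O_EXCL makes creation fail if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError as exc:
        raise LockHeld(str(lock_path)) from exc
    try:
        with os.fdopen(fd, "w") as f:
            f.write(str(os.getpid()))  # record the owner to help debug stale locks
        yield lock_path
    finally:
        os.unlink(lock_path)  # release the lock, even if the block raised
```

The usual caveat applies: after a system crash the lock file remains, so some policy for detecting and breaking stale locks would be needed on top of this.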
Other potential issues?
I guess in all cases I might run into issues with the async approach of v3, or have to force synchronization for the transactions.
Happy to hear any opinions on this.
Thank you and best regards,
Fabricio