Key encoding for dbm-style databases #133

Closed
alimanfoo opened this issue Feb 27, 2017 · 3 comments
Labels
enhancement New features or improvements release notes done Automatically applied to PRs which have release notes.

@alimanfoo
Member

A number of embedded key-value stores are available, including the dbm-style databases in the Python standard library, as well as BerkeleyDB and Kyoto Cabinet. These provide a MutableMapping interface but generally expect keys to be bytes objects rather than strings, and return bytes objects when asked for an iterator over keys. Zarr currently uses text strings for keys, so these key-value databases cannot be used directly with Zarr. However, it would be nice, and should be straightforward, to provide some support.

There are two options. (1) Provide a MutableMapping wrapper class that performs key encoding/decoding (e.g., 'zarr.storage.KeyValueStore' or 'zarr.storage.DbmStore'). (2) Add a key_encoding option to the Array and Group classes. It would be None by default, preserving the current behaviour of using text strings as keys; if not None, it would be used to encode and decode keys outside of the store.
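To make option (1) concrete, here is a minimal sketch of what such a wrapper might look like. The class name `KeyEncodingStore` and its `encoding` parameter are made up for illustration and are not part of zarr; a plain dict stands in for a bytes-keyed database.

```python
from collections.abc import MutableMapping


class KeyEncodingStore(MutableMapping):
    """Hypothetical wrapper: text keys on the outside, bytes keys inside."""

    def __init__(self, store, encoding="utf-8"):
        self.store = store          # any mapping that expects bytes keys
        self.encoding = encoding    # assumed default; not specified in the issue

    def _encode(self, key):
        return key.encode(self.encoding)

    def __getitem__(self, key):
        return self.store[self._encode(key)]

    def __setitem__(self, key, value):
        self.store[self._encode(key)] = value

    def __delitem__(self, key):
        del self.store[self._encode(key)]

    def __iter__(self):
        # The underlying store yields bytes keys; decode them back to text.
        for key in self.store.keys():
            yield key.decode(self.encoding)

    def __len__(self):
        return len(self.store)


# A plain dict stands in for a dbm-style database here.
backend = {}
store = KeyEncodingStore(backend)
store["foo/.zarray"] = b"{}"
```

After this runs, `backend` holds the bytes key `b"foo/.zarray"`, while code using `store` only ever sees text keys.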

@alimanfoo alimanfoo added this to the v2.2 milestone Feb 27, 2017
@alimanfoo alimanfoo changed the title Key encoding Key encoding for dbm-style databases Feb 27, 2017
@alimanfoo
Member Author

I'm leaning towards (2) as it seems reasonable and avoids yet another layer of indirection for the user.
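For comparison, option (2) could be sketched roughly as below. `ToyContainer` is a hypothetical stand-in for the Array and Group classes, and its `write`, `read` and `keys` methods are invented for this example; only the `key_encoding` option and its None default come from the proposal above.

```python
class ToyContainer:
    """Hypothetical stand-in for Array/Group; not zarr's actual class."""

    def __init__(self, store, key_encoding=None):
        self.store = store
        # None preserves the current behaviour: text keys passed through as-is.
        self.key_encoding = key_encoding

    def _store_key(self, key):
        if self.key_encoding is None:
            return key
        return key.encode(self.key_encoding)

    def _text_key(self, key):
        if self.key_encoding is None:
            return key
        return key.decode(self.key_encoding)

    def write(self, key, value):
        self.store[self._store_key(key)] = value

    def read(self, key):
        return self.store[self._store_key(key)]

    def keys(self):
        # Keys coming back from the store are decoded outside of the store.
        return sorted(self._text_key(k) for k in self.store.keys())


text_store = {}
ToyContainer(text_store).write("0.0", b"chunk")

bytes_store = {}
encoded = ToyContainer(bytes_store, key_encoding="ascii")
encoded.write("0.0", b"chunk")
```

Here the store itself is left untouched and the encoding happens in the class that uses it, so the user passes the database object directly rather than wrapping it first.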

@jakirkham
Member

One thing to consider is whether this will open the door to handling a variety of different encodings.

Also related, will the encoding information be stored somehow?

@alimanfoo
Member Author

alimanfoo commented Feb 27, 2017 via email

@alimanfoo alimanfoo mentioned this issue Apr 6, 2017
@alimanfoo alimanfoo removed this from the v2.2 milestone Oct 31, 2017
@alimanfoo alimanfoo added this to the v2.2 milestone Nov 16, 2017
@alimanfoo alimanfoo mentioned this issue Nov 16, 2017
@alimanfoo alimanfoo added enhancement New features or improvements release notes done Automatically applied to PRs which have release notes. labels Nov 19, 2017