Commit 5c71212

grlee77, Carreau, and joshmoore authored
Create a Base store class for Zarr Store (update) (#789)
* fix conflicts
* cleanup naming
* zip move
* fix erasability test
* test for warning
* please flake
* remove uncovered lines
* remove uncovered lines in tests
* pragma no cover for exceptional case
* minor docstring fixes
  add assert statements to test_capabilities
* pep8 fix
* avoid NumPy 1.21.0 due to numpy/numpy#19325
* move Store class and some helper functions to zarr._storage.store
  update version in Store docstring
* BUG: ABSStore should inherit from Store
* pep8 fix
* TST: make CustomMapping a subclass of Store
  TST: initialize stores with KVStore(dict()) instead of bare dict()
* update version mentioned in Store docstring
* update version mentioned in warning message
* use Store._ensure_store in Attributes class
  ensures Attributes.store is a Store
* TST: add Attributes test case ensuring store gets coerced to a Store
* use Store._ensure_store in normalize_store_arg
  ensures open_array, etc can work when the user supplies a dict
* TST: make sure high level creation functions also work when passed a dict for store
* TST: add test case with group initialized from dict
* TST: add test case with Array initialized from dict
* change CustomMapping back to type object, not Store
  want to test the non-Store code path in _ensure_store
* pep8 fixes
* update/fix new hierarchy test case to complete code coverage
* create a BaseStore parent for Store
  BaseStore does not have the listdir or rmdir methods
  cleaned up some type declerations, making sure mypy passes
* flake8
* restore is_erasable check to rmdir function
  Otherwise the save_array doc example fails to write to a ZipStore

Co-authored-by: Matthias Bussonnier <[email protected]>
Co-authored-by: Josh Moore <[email protected]>
Co-authored-by: jmoore <[email protected]>
1 parent 523dbb8 commit 5c71212

19 files changed, +794 -583 lines changed

docs/tutorial.rst

Lines changed: 7 additions & 5 deletions
@@ -176,7 +176,7 @@ print some diagnostics, e.g.::
 Read-only : False
 Compressor : Blosc(cname='zstd', clevel=3, shuffle=BITSHUFFLE,
 : blocksize=0)
-Store type : builtins.dict
+Store type : zarr.storage.KVStore
 No. bytes : 400000000 (381.5M)
 No. bytes stored : 3379344 (3.2M)
 Storage ratio : 118.4
@@ -268,7 +268,7 @@ Here is an example using a delta filter with the Blosc compressor::
 Read-only : False
 Filter [0] : Delta(dtype='<i4')
 Compressor : Blosc(cname='zstd', clevel=1, shuffle=SHUFFLE, blocksize=0)
-Store type : builtins.dict
+Store type : zarr.storage.KVStore
 No. bytes : 400000000 (381.5M)
 No. bytes stored : 1290562 (1.2M)
 Storage ratio : 309.9
@@ -805,8 +805,10 @@ Here is an example using S3Map to read an array created previously::
 Order : C
 Read-only : False
 Compressor : Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE, blocksize=0)
-Store type : fsspec.mapping.FSMap
+Store type : zarr.storage.KVStore
 No. bytes : 21
+No. bytes stored : 382
+Storage ratio : 0.1
 Chunks initialized : 3/3
 >>> z[:]
 array([b'H', b'e', b'l', b'l', b'o', b' ', b'f', b'r', b'o', b'm', b' ',
@@ -1274,7 +1276,7 @@ ratios, depending on the correlation structure within the data. E.g.::
 Order : C
 Read-only : False
 Compressor : Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE, blocksize=0)
-Store type : builtins.dict
+Store type : zarr.storage.KVStore
 No. bytes : 400000000 (381.5M)
 No. bytes stored : 6696010 (6.4M)
 Storage ratio : 59.7
@@ -1288,7 +1290,7 @@ ratios, depending on the correlation structure within the data. E.g.::
 Order : F
 Read-only : False
 Compressor : Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE, blocksize=0)
-Store type : builtins.dict
+Store type : zarr.storage.KVStore
 No. bytes : 400000000 (381.5M)
 No. bytes stored : 4684636 (4.5M)
 Storage ratio : 85.4

mypy.ini

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
 [mypy]
-python_version = 3.7
+python_version = 3.8
 ignore_missing_imports = True
 follow_imports = silent

pytest.ini

Lines changed: 2 additions & 0 deletions
@@ -3,4 +3,6 @@ doctest_optionflags = NORMALIZE_WHITESPACE ELLIPSIS IGNORE_EXCEPTION_DETAIL
 addopts = --durations=10
 filterwarnings =
     error::DeprecationWarning:zarr.*
+    error::UserWarning:zarr.*
     ignore:PY_SSIZE_T_CLEAN will be required.*:DeprecationWarning
+    ignore:The loop argument is deprecated since Python 3.8.*:DeprecationWarning
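The new `error::UserWarning:zarr.*` entry makes the test run fail whenever a `UserWarning` is attributed to a module whose name matches `zarr.*`. The following is a rough sketch of the same action/category/module filter using only the standard-library `warnings` module. The helpers `emit` and `raised_as_error`, the warning text, and the file name are hypothetical, and pytest applies its own ini parsing on top of this mechanism.

```python
import warnings


def emit(module_name, category=UserWarning):
    """Issue a warning attributed to ``module_name`` (hypothetical helper)."""
    warnings.warn_explicit(
        "store was coerced",          # illustrative message, not zarr's text
        category,
        filename="example.py",
        lineno=1,
        module=module_name,
    )


def raised_as_error(module_name):
    """Return True if a UserWarning from ``module_name`` becomes an exception."""
    with warnings.catch_warnings():
        # Baseline: silence everything, so only the next filter can escalate.
        warnings.simplefilter("ignore")
        # Same triple as ``error::UserWarning:zarr.*``; inserted at the front
        # of the filter list, so it takes precedence over the baseline.
        warnings.filterwarnings("error", category=UserWarning, module=r"zarr.*")
        try:
            emit(module_name)
        except UserWarning:
            return True
        return False


from_zarr = raised_as_error("zarr.storage")   # matches the module pattern
from_other = raised_as_error("numpy.core")    # falls through to "ignore"
```

Only the warning attributed to a `zarr.*` module is escalated. Warnings from other packages remain subject to whatever other filters are active.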

zarr/_storage/absstore.py

Lines changed: 2 additions & 2 deletions
@@ -1,16 +1,16 @@
 """This module contains storage classes related to Azure Blob Storage (ABS)"""
 
 import warnings
-from collections.abc import MutableMapping
 from numcodecs.compat import ensure_bytes
 from zarr.util import normalize_storage_path
+from zarr._storage.store import Store
 
 __doctest_requires__ = {
     ('ABSStore', 'ABSStore.*'): ['azure.storage.blob'],
 }
 
 
-class ABSStore(MutableMapping):
+class ABSStore(Store):
     """Storage class using Azure Blob Storage (ABS).
 
     Parameters
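By inheriting from `Store` instead of `MutableMapping`, `ABSStore` picks up the capability flags, the no-op `close()`, and the context-manager behaviour that this commit defines in `zarr/_storage/store.py`. The sketch below mirrors that pattern with stand-in classes so it runs without zarr installed. `MiniBaseStore` and `DictStore` are hypothetical names, not zarr classes.

```python
from collections.abc import MutableMapping


class MiniBaseStore(MutableMapping):
    """Hypothetical stand-in mirroring the BaseStore pattern in this commit."""

    _erasable = True

    def is_erasable(self):
        return self._erasable

    def __enter__(self):
        # Count nested ``with`` blocks so close() runs only on the last exit.
        if not hasattr(self, "_open_count"):
            self._open_count = 0
        self._open_count += 1
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self._open_count -= 1
        if self._open_count == 0:
            self.close()

    def close(self):
        """Do nothing by default."""


class DictStore(MiniBaseStore):
    """Hypothetical dict-backed store that records how often close() ran."""

    def __init__(self):
        self._data = {}
        self.close_calls = 0

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def close(self):
        self.close_calls += 1


store = DictStore()
with store:
    with store:                               # nested use of the same store
        store["a/.zarray"] = b"{}"
    closed_after_inner = store.close_calls    # inner exit must not close
closed_after_outer = store.close_calls        # outer exit closes exactly once
```

The open counter is what lets the same store object be entered more than once without being closed prematurely.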

zarr/_storage/store.py

Lines changed: 166 additions & 0 deletions
@@ -0,0 +1,166 @@
+from collections.abc import MutableMapping
+from typing import Any, List, Optional, Union
+
+from zarr.util import normalize_storage_path
+
+# v2 store keys
+array_meta_key = '.zarray'
+group_meta_key = '.zgroup'
+attrs_key = '.zattrs'
+
+
+class BaseStore(MutableMapping):
+    """Abstract base class for store implementations.
+
+    This is a thin wrapper over MutableMapping that provides methods to check
+    whether a store is readable, writeable, eraseable and or listable.
+
+    Stores cannot be mutable mapping as they do have a couple of other
+    requirements that would break Liskov substitution principle (stores only
+    allow strings as keys, mutable mapping are more generic).
+
+    Having no-op base method also helps simplifying store usage and do not need
+    to check the presence of attributes and methods, like `close()`.
+
+    Stores can be used as context manager to make sure they close on exit.
+
+    .. added: 2.11.0
+
+    """
+
+    _readable = True
+    _writeable = True
+    _erasable = True
+    _listable = True
+
+    def is_readable(self):
+        return self._readable
+
+    def is_writeable(self):
+        return self._writeable
+
+    def is_listable(self):
+        return self._listable
+
+    def is_erasable(self):
+        return self._erasable
+
+    def __enter__(self):
+        if not hasattr(self, "_open_count"):
+            self._open_count = 0
+        self._open_count += 1
+        return self
+
+    def __exit__(self, exc_type, exc_value, traceback):
+        self._open_count -= 1
+        if self._open_count == 0:
+            self.close()
+
+    def close(self) -> None:
+        """Do nothing by default"""
+        pass
+
+    def rename(self, src_path: str, dst_path: str) -> None:
+        if not self.is_erasable():
+            raise NotImplementedError(
+                f'{type(self)} is not erasable, cannot call "rename"'
+            )  # pragma: no cover
+        _rename_from_keys(self, src_path, dst_path)
+
+    @staticmethod
+    def _ensure_store(store: Any):
+        """
+        We want to make sure internally that zarr stores are always a class
+        with a specific interface derived from ``BaseStore``, which is slightly
+        different than ``MutableMapping``.
+
+        We'll do this conversion in a few places automatically
+        """
+        from zarr.storage import KVStore  # avoid circular import
+
+        if store is None:
+            return None
+        elif isinstance(store, BaseStore):
+            return store
+        elif isinstance(store, MutableMapping):
+            return KVStore(store)
+        else:
+            for attr in [
+                "keys",
+                "values",
+                "get",
+                "__setitem__",
+                "__getitem__",
+                "__delitem__",
+                "__contains__",
+            ]:
+                if not hasattr(store, attr):
+                    break
+            else:
+                return KVStore(store)
+
+        raise ValueError(
+            "Starting with Zarr 2.11.0, stores must be subclasses of "
+            "BaseStore, if your store exposes the MutableMapping interface "
+            f"wrap it in Zarr.storage.KVStore. Got {store}"
+        )
+
+
+class Store(BaseStore):
+    """Abstract store class used by implementations following the Zarr v2 spec.
+
+    Adds public `listdir`, `rename`, and `rmdir` methods on top of BaseStore.
+
+    .. added: 2.11.0
+
+    """
+    def listdir(self, path: str = "") -> List[str]:
+        path = normalize_storage_path(path)
+        return _listdir_from_keys(self, path)
+
+    def rmdir(self, path: str = "") -> None:
+        if not self.is_erasable():
+            raise NotImplementedError(
+                f'{type(self)} is not erasable, cannot call "rmdir"'
+            )  # pragma: no cover
+        path = normalize_storage_path(path)
+        _rmdir_from_keys(self, path)
+
+
+def _path_to_prefix(path: Optional[str]) -> str:
+    # assume path already normalized
+    if path:
+        prefix = path + '/'
+    else:
+        prefix = ''
+    return prefix
+
+
+def _rename_from_keys(store: BaseStore, src_path: str, dst_path: str) -> None:
+    # assume path already normalized
+    src_prefix = _path_to_prefix(src_path)
+    dst_prefix = _path_to_prefix(dst_path)
+    for key in list(store.keys()):
+        if key.startswith(src_prefix):
+            new_key = dst_prefix + key.lstrip(src_prefix)
+            store[new_key] = store.pop(key)
+
+
+def _rmdir_from_keys(store: Union[BaseStore, MutableMapping], path: Optional[str] = None) -> None:
+    # assume path already normalized
+    prefix = _path_to_prefix(path)
+    for key in list(store.keys()):
+        if key.startswith(prefix):
+            del store[key]
+
+
+def _listdir_from_keys(store: BaseStore, path: Optional[str] = None) -> List[str]:
+    # assume path already normalized
+    prefix = _path_to_prefix(path)
+    children = set()
+    for key in list(store.keys()):
+        if key.startswith(prefix) and len(key) > len(prefix):
+            suffix = key[len(prefix):]
+            child = suffix.split('/')[0]
+            children.add(child)
+    return sorted(children)
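Review note on `_rename_from_keys`: `str.lstrip` takes a set of characters rather than a prefix, so `key.lstrip(src_prefix)` can remove more than the prefix when the rest of the key begins with characters that also occur in it. The sketch below compares that behaviour with slicing off exactly `len(src_prefix)` characters. `rename_with_lstrip` and `rename_with_slice` are hypothetical helpers written for this illustration, operating on a plain dict. This is not a statement about how the project later handled it.

```python
def _path_to_prefix(path):
    # Same rule as the helper in the diff: non-empty paths get a trailing slash.
    return path + "/" if path else ""


def rename_with_lstrip(store, src_path, dst_path):
    """Mirror of the loop in the diff, using str.lstrip (character-set strip)."""
    src_prefix = _path_to_prefix(src_path)
    dst_prefix = _path_to_prefix(dst_path)
    for key in list(store.keys()):
        if key.startswith(src_prefix):
            store[dst_prefix + key.lstrip(src_prefix)] = store.pop(key)


def rename_with_slice(store, src_path, dst_path):
    """Variant that drops exactly the prefix by slicing."""
    src_prefix = _path_to_prefix(src_path)
    dst_prefix = _path_to_prefix(dst_path)
    for key in list(store.keys()):
        if key.startswith(src_prefix):
            store[dst_prefix + key[len(src_prefix):]] = store.pop(key)


# A group "foo" containing an array "foobar": the child name starts with
# characters that also appear in the prefix "foo/".
stripped = {"foo/foobar/.zarray": b"{}"}
rename_with_lstrip(stripped, "foo", "baz")

sliced = {"foo/foobar/.zarray": b"{}"}
rename_with_slice(sliced, "foo", "baz")

print(sorted(stripped))  # ['baz/bar/.zarray']    (child name was truncated)
print(sorted(sliced))    # ['baz/foobar/.zarray'] (child name preserved)
```

Keys whose first character after the prefix does not appear in the prefix are unaffected, which may be why the difference is easy to miss in tests.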

zarr/attrs.py

Lines changed: 2 additions & 1 deletion
@@ -1,6 +1,7 @@
 from collections.abc import MutableMapping
 
 from zarr.meta import parse_metadata
+from zarr._storage.store import Store
 from zarr.util import json_dumps
 
 
@@ -26,7 +27,7 @@ class Attributes(MutableMapping):
 
     def __init__(self, store, key='.zattrs', read_only=False, cache=True,
                  synchronizer=None):
-        self.store = store
+        self.store = Store._ensure_store(store)
         self.key = key
         self.read_only = read_only
         self.cache = cache
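With this change `Attributes` no longer stores whatever object it is given. It passes it through `Store._ensure_store`, which returns `None` and existing stores unchanged, wraps `MutableMapping` instances and duck-typed mappings, and rejects everything else. The following is a minimal stand-in sketch of that decision order, assuming simplified classes. `MiniBaseStore`, `MiniKVStore`, `DuckMapping`, and `ensure_store` are hypothetical names, and `all(...)` replaces the `for`/`else` loop of the diff with equivalent logic.

```python
from collections.abc import MutableMapping

REQUIRED_ATTRS = ("keys", "values", "get", "__setitem__", "__getitem__",
                  "__delitem__", "__contains__")


class MiniBaseStore(MutableMapping):
    """Hypothetical stand-in for BaseStore."""


class MiniKVStore(MiniBaseStore):
    """Hypothetical stand-in for KVStore: wraps any mapping-like object."""

    def __init__(self, mapping):
        self._mapping = mapping

    def __getitem__(self, key):
        return self._mapping[key]

    def __setitem__(self, key, value):
        self._mapping[key] = value

    def __delitem__(self, key):
        del self._mapping[key]

    def __iter__(self):
        return iter(self._mapping.keys())

    def __len__(self):
        return len(self._mapping.keys())


class DuckMapping:
    """Not a MutableMapping subclass, but exposes the required methods."""

    def __init__(self):
        self._d = {}

    def keys(self):
        return self._d.keys()

    def values(self):
        return self._d.values()

    def get(self, key, default=None):
        return self._d.get(key, default)

    def __setitem__(self, key, value):
        self._d[key] = value

    def __getitem__(self, key):
        return self._d[key]

    def __delitem__(self, key):
        del self._d[key]

    def __contains__(self, key):
        return key in self._d


def ensure_store(store):
    """Coerce ``store`` following the same order of checks as the diff."""
    if store is None:
        return None
    if isinstance(store, MiniBaseStore):
        return store                      # already a store: pass through
    if isinstance(store, MutableMapping):
        return MiniKVStore(store)         # e.g. a plain dict
    if all(hasattr(store, attr) for attr in REQUIRED_ATTRS):
        return MiniKVStore(store)         # duck-typed mapping
    raise ValueError(f"cannot coerce {store!r} to a store")


wrapped = ensure_store({".zattrs": b"{}"})
duck = ensure_store(DuckMapping())
```

Here a plain `dict` and a duck-typed mapping both come back wrapped, which corresponds to the `builtins.dict` to `zarr.storage.KVStore` change in the tutorial output.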
