Commit 3db4176

docs: ZIP-related tweaks (#1641)
* docs: use 'ZIP archive' instead of 'zip file'; clarify utility of caching in s3 + ZIP example; style
* docs: update release notes, correct spelling of greg lee's name in past release notes, and fix markup in past release notes
1 parent 7449853 commit 3db4176

2 files changed: 24 additions and 23 deletions
docs/release.rst

Lines changed: 10 additions & 10 deletions
@@ -62,8 +62,8 @@ Docs
 * Add Norman Rzepka to core-dev team.
 By :user:`Joe Hamman <jhamman>` :issue:`1630`.

-* Added section about accessing zip files that are on s3.
-By :user:`Jeff Peck <jeffpeck10x>` :issue:`1613`.
+* Added section about accessing ZIP archives on s3.
+By :user:`Jeff Peck <jeffpeck10x>` :issue:`1613`, :issue:`1615`, and :user:`Davis Bennett <d-v-b>` :issue:`1641`.

 * Add V3 roadmap and design document.
 By :user:`Joe Hamman <jhamman>` :issue:`1583`.
@@ -157,10 +157,10 @@ Maintenance
 By :user:`Davis Bennett <d-v-b>` :issue:`1462`.

 * Style the codebase with ``ruff`` and ``black``.
-By :user:`Davis Bennett` <d-v-b> :issue:`1459`
+By :user:`Davis Bennett <d-v-b>` :issue:`1459`

 * Ensure that chunks is tuple of ints upon array creation.
-By :user:`Philipp Hanslovsky` <hanslovsky> :issue:`1461`
+By :user:`Philipp Hanslovsky <hanslovsky>` :issue:`1461`

 .. _release_2.15.0:

@@ -548,7 +548,7 @@ Maintenance
 By :user:`Saransh Chopra <Saransh-cpp>` :issue:`1079`.

 * Remove option to return None from _ensure_store.
-By :user:`Greggory Lee <grlee77>` :issue:`1068`.
+By :user:`Gregory Lee <grlee77>` :issue:`1068`.

 * Fix a typo of "integers".
 By :user:`Richard Scott <RichardScottOZ>` :issue:`1056`.
@@ -566,7 +566,7 @@ Enhancements
 Since the format is not yet finalized, the classes and functions are not
 automatically imported into the regular `zarr` name space. Setting the
 `ZARR_V3_EXPERIMENTAL_API` environment variable will activate them.
-By :user:`Greggory Lee <grlee77>`; :issue:`898`, :issue:`1006`, and :issue:`1007`
+By :user:`Gregory Lee <grlee77>`; :issue:`898`, :issue:`1006`, and :issue:`1007`
 as well as by :user:`Josh Moore <joshmoore>` :issue:`1032`.

 * **Create FSStore from an existing fsspec filesystem**. If you have created
@@ -688,7 +688,7 @@ Enhancements
 higher-level array creation and convenience functions still accept plain
 Python dicts or other mutable mappings for the ``store`` argument, but will
 internally convert these to a ``KVStore``.
-By :user:`Greggory Lee <grlee77>`; :issue:`839`, :issue:`789`, and :issue:`950`.
+By :user:`Gregory Lee <grlee77>`; :issue:`839`, :issue:`789`, and :issue:`950`.

 * Allow to assign array ``fill_values`` and update metadata accordingly.
 By :user:`Ryan Abernathey <rabernat>`, :issue:`662`.
@@ -835,7 +835,7 @@ Bug fixes
 ~~~~~~~~~

 * Fix FSStore.listdir behavior for nested directories.
-By :user:`Greggory Lee <grlee77>`; :issue:`802`.
+By :user:`Gregory Lee <grlee77>`; :issue:`802`.

 .. _release_2.9.4:

@@ -919,7 +919,7 @@ Bug fixes
 By :user:`Josh Moore <joshmoore>`; :issue:`781`.

 * avoid NumPy 1.21.0 due to https://github.com/numpy/numpy/issues/19325
-By :user:`Greggory Lee <grlee77>`; :issue:`791`.
+By :user:`Gregory Lee <grlee77>`; :issue:`791`.

 Maintenance
 ~~~~~~~~~~~
@@ -931,7 +931,7 @@ Maintenance
 By :user:`Elliott Sales de Andrade <QuLogic>`; :issue:`799`.

 * TST: add missing assert in test_hexdigest.
-By :user:`Greggory Lee <grlee77>`; :issue:`801`.
+By :user:`Gregory Lee <grlee77>`; :issue:`801`.

 .. _release_2.8.3:

docs/tutorial.rst

Lines changed: 14 additions & 13 deletions
@@ -774,7 +774,7 @@ the following code::

 Any other compatible storage class could be used in place of
 :class:`zarr.storage.DirectoryStore` in the code examples above. For example,
-here is an array stored directly into a Zip file, via the
+here is an array stored directly into a ZIP archive, via the
 :class:`zarr.storage.ZipStore` class::

 >>> store = zarr.ZipStore('data/example.zip', mode='w')
@@ -798,12 +798,12 @@ Re-open and check that data have been written::
 [42, 42, 42, ..., 42, 42, 42]], dtype=int32)
 >>> store.close()

-Note that there are some limitations on how Zip files can be used, because items
-within a Zip file cannot be updated in place. This means that data in the array
+Note that there are some limitations on how ZIP archives can be used, because items
+within a ZIP archive cannot be updated in place. This means that data in the array
 should only be written once and write operations should be aligned with chunk
 boundaries. Note also that the ``close()`` method must be called after writing
 any data to the store, otherwise essential records will not be written to the
-underlying zip file.
+underlying ZIP archive.

 Another storage alternative is the :class:`zarr.storage.DBMStore` class, added
 in Zarr version 2.2. This class allows any DBM-style database to be used for
@@ -846,7 +846,7 @@ respectively require the `redis-py <https://redis-py.readthedocs.io>`_ and
 `pymongo <https://api.mongodb.com/python/current/>`_ packages to be installed.

 For compatibility with the `N5 <https://github.com/saalfeldlab/n5>`_ data format, Zarr also provides
-an N5 backend (this is currently an experimental feature). Similar to the zip storage class, an
+an N5 backend (this is currently an experimental feature). Similar to the ZIP storage class, an
 :class:`zarr.n5.N5Store` can be instantiated directly::

 >>> store = zarr.N5Store('data/example.n5')
@@ -1000,12 +1000,13 @@ separately from Zarr.

 .. _tutorial_copy:

-Accessing Zip Files on S3
-~~~~~~~~~~~~~~~~~~~~~~~~~
+Accessing ZIP archives on S3
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-The built-in `ZipStore` will only work with paths on the local file-system, however
-it is also possible to access ``.zarr.zip`` data on the cloud. Here is an example of
-accessing a zipped Zarr file on s3:
+The built-in :class:`zarr.storage.ZipStore` will only work with paths on the local file-system; however
+it is possible to access ZIP-archived Zarr data on the cloud via the `ZipFileSystem <https://filesystem-spec.readthedocs.io/en/latest/_modules/fsspec/implementations/zip.html>`_
+class from ``fsspec``. The following example demonstrates how to access
+a ZIP-archived Zarr group on s3 using `s3fs <https://s3fs.readthedocs.io/en/latest/>`_ and ``ZipFileSystem``:

 >>> s3_path = "s3://path/to/my.zarr.zip"
 >>>
@@ -1014,15 +1015,15 @@ accessing a zipped Zarr file on s3:
 >>> fs = ZipFileSystem(f, mode="r")
 >>> store = FSMap("", fs, check=False)
 >>>
->>> # cache is optional, but may be a good idea depending on the situation
+>>> # caching may improve performance when repeatedly reading the same data
 >>> cache = zarr.storage.LRUStoreCache(store, max_size=2**28)
 >>> z = zarr.group(store=cache)

 This store can also be generated with ``fsspec``'s handler chaining, like so:

 >>> store = zarr.storage.FSStore(url=f"zip::{s3_path}", mode="r")

-This can be especially useful if you have a very large ``.zarr.zip`` file on s3
+This can be especially useful if you have a very large ZIP-archived Zarr array or group on s3
 and only need to access a small portion of it.

 Consolidating metadata
@@ -1161,7 +1162,7 @@ re-compression, and so should be faster. E.g.::
 └── spam (100,) int64
 >>> new_root['foo/bar/baz'][:]
 array([ 0, 1, 2, ..., 97, 98, 99])
->>> store2.close() # zip stores need to be closed
+>>> store2.close() # ZIP stores need to be closed

 .. _tutorial_strings:
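
The tutorial text touched by this commit says that ``close()`` must be called after writing to a ZIP store, otherwise essential records are not written to the archive. The sketch below illustrates why. It is an assumption-laden illustration: it uses only Python's standard-library ``zipfile`` module, not :class:`zarr.storage.ZipStore`, and the helper name ``has_end_record`` and the chunk key ``"0.0"`` are hypothetical names chosen for this example.

```python
import io
import zipfile


def has_end_record(raw: bytes) -> bool:
    """Return True if the bytes contain a ZIP end-of-central-directory record."""
    return zipfile.is_zipfile(io.BytesIO(raw))


# Write one member into an in-memory archive (stand-in for a chunk in a store).
buf = io.BytesIO()
zf = zipfile.ZipFile(buf, mode="w")
zf.writestr("0.0", b"chunk bytes")  # hypothetical chunk key and payload

# Snapshot the bytes before closing: the member data is present, but the
# central directory and end record have not been written yet.
readable_before_close = has_end_record(buf.getvalue())

# close() appends the central directory and the end record.
zf.close()
readable_after_close = has_end_record(buf.getvalue())

# Only the closed archive can be re-opened and listed.
with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as archive:
    names = archive.namelist()
```

In this sketch the snapshot taken before ``close()`` is not recognised as a ZIP archive, while the one taken afterwards is and lists the single member. The same mechanism is presumably what the tutorial's note about ``ZipStore.close()`` refers to.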
