bug in save for variable length arrays? #1035

JamiePringle opened this issue May 23, 2022 · 2 comments
Labels
bug Potential issues with the zarr-python library

Comments

@JamiePringle

This is related to #691, but with more specifics. Saving a variable-length (ragged) array with zarr.save fails, but creating the store without the convenience function works. I am running Python 3.9.12 and zarr 2.11.3. Running the following reproduces the failure:

import zarr
import numcodecs
import numpy as np

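# Create a 1-D object array whose elements are variable-length integer arrays
z = zarr.empty(4, dtype=object, object_codec=numcodecs.VLenArray(int))
z[0] = np.array([1, 3, 5])
z[1] = np.array([4])
z[2] = np.array([7, 9, 14])
z[3] = np.array([1, 1])

# This call raises ValueError: missing object_codec for object array
zarr.save('jnk.zarr', z)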

It fails with a traceback that culminates with

File ~/anaconda3/envs/py3_parcels_mpi_bleedingApr2022/lib/python3.9/site-packages/zarr/storage.py:427, in _init_array_metadata(store, shape, chunks, dtype, compressor, fill_value, order, overwrite, path, chunk_store, filters, object_codec, dimension_separator)
    424 if object_codec is None:
    425     if not filters:
    426         # there are no filters so we can be sure there is no object codec
--> 427         raise ValueError('missing object_codec for object array')
    428     else:
    429         # one of the filters may be an object codec, issue a warning rather
    430         # than raise an error to maintain backwards-compatibility
    431         warnings.warn('missing object_codec for object array; this will raise a '
    432                       'ValueError in version 3.0', FutureWarning)

ValueError: missing object_codec for object array
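For readers following along, the branch shown in the traceback can be sketched in plain Python. This is only an illustration of the lines quoted above, not zarr's actual code: the function name `check_object_codec` and the `dtype_is_object` flag are hypothetical, and the assumption that the check applies only to object arrays is mine. Since the error branch (rather than the warning branch) was hit, the array created by zarr.save must have had neither an object codec nor any filters.

```python
import warnings


def check_object_codec(dtype_is_object, object_codec=None, filters=None):
    """Hypothetical stand-in for the check quoted in the traceback."""
    if not dtype_is_object:
        # Assumption: non-object arrays skip this check entirely.
        return "ok"
    if object_codec is None:
        if not filters:
            # No filters at all, so there cannot be an object codec.
            raise ValueError('missing object_codec for object array')
        # A filter may be acting as the object codec, so only warn.
        warnings.warn('missing object_codec for object array; this will '
                      'raise a ValueError in version 3.0', FutureWarning)
        return "warned"
    return "ok"


# The failing case: object dtype, no codec, no filters.
try:
    check_object_codec(True)
except ValueError as err:
    print(err)
```

Under this reading, the error suggests that the codec attached to the source array is not carried over when the convenience function creates the new array.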

This makes it rather hard to use ragged arrays. Am I doing something wrong, or is something broken? What I really need is to write ragged arrays to zarr data stores. Note that z.filters on the source array returns [VLenArray(dtype='<i8')].

However, if I create the data store manually, it works fine. The following code runs without error:

import zarr
import numcodecs
import numpy as np

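# Create the on-disk store and a root group explicitly
store = zarr.DirectoryStore('jnkStore.zarr')
root = zarr.group(store=store)

# Create the ragged array directly in the group, passing the object codec
z = root.empty(shape=(4,), name='z', dtype=object, object_codec=numcodecs.VLenArray(int))
z[0] = np.array([1, 3, 5])
z[1] = np.array([4])
z[2] = np.array([7, 9, 14])
z[3] = np.array([1, 1])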

@jakirkham
Member

Can you please also include conda list or pip list as appropriate?

@JamiePringle
Author

Attached is the output of conda list. Cheers, Jamie

conda_list.txt