Xarray serialization warning when saving dataset #853

@tomwhite

From #785:

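import sgkit as sg
import sgkit.io.vcf as sgvcf

# Convert the sample VCF to Zarr, then load the result as an xarray dataset.
sgvcf.vcf_to_zarr("sgkit/tests/io/vcf/data/sample.vcf.gz", "sample.vcf.gz.zarr")
ds = sg.load_dataset("sample.vcf.gz.zarr")
# Saving the loaded dataset to a new store triggers the warning.
sg.save_dataset(ds, "sample2.vcf.gz.zarr", mode="w")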

prints the warning:

SerializationWarning: variable None has data in the form of a dask array with dtype=object, which means it is being loaded into memory to determine a data type that can be safely stored on disk. To avoid this, coerce this variable to a fixed-size dtype with astype() before saving it.

There is an upstream xarray issue here: pydata/xarray#5769. #643 is related too.
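
For reference, the coercion that the warning text suggests could look roughly like the sketch below. It is not the fix adopted in sgkit or xarray. `coerce_object_vars` is a hypothetical helper, the variable names and values are made up for illustration, and the sketch assumes every object-dtype data variable holds only strings. It scans the data once to find the longest string, so with a dask-backed variable it would still load that variable into memory.

```python
import numpy as np
import xarray as xr


def coerce_object_vars(ds: xr.Dataset) -> xr.Dataset:
    """Cast object-dtype data variables to fixed-width unicode.

    Hypothetical helper, not part of sgkit or xarray. Assumes each
    object-dtype variable contains only strings.
    """
    out = ds.copy()  # shallow copy, so the input dataset is left unchanged
    for name, da in ds.data_vars.items():
        if da.dtype.kind != "O":
            continue
        # One pass over the values to find the longest string (minimum width 1).
        values = np.asarray(da.values, dtype=object).ravel()
        max_len = max((len(str(v)) for v in values), default=1)
        out[name] = da.astype(f"U{max(max_len, 1)}")
    return out


# Toy dataset with one object-dtype string variable (illustrative values only).
ds = xr.Dataset(
    {
        "variant_id": ("variants", np.array(["a", "bcd", "ef"], dtype=object)),
        "variant_position": ("variants", np.array([10, 20, 30])),
    }
)
fixed = coerce_object_vars(ds)
print(fixed["variant_id"].dtype)
```

A dataset returned by the helper would then be passed to `sg.save_dataset` in place of the original. Whether that fully silences the warning for the sample VCF data has not been verified here.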

Labels: bug, upstream