Description
I'm trying to run `data-pipeline/caids/get_caids.py` with a different dataset, but am encountering an issue.
When running lines 139 to 147 of the script (`all_part_urls` and `completed_part_urls`), Hail errors out with the message:
await f.url() async for f in await fs.listfiles(sharded_vcf_url) if f.name().startswith("part-")
^^^^^^
AttributeError: 'GoogleStorageFileListEntry' object has no attribute 'name'. Did you mean: '_name'?
Looking at the Hail source code on line 483 of `hail/python/hailtop/aiocloud/aiogoogle/client/storage_client.py`, I can see that the `GoogleStorageFileListEntry` class indeed does not have a `name` method.
It seems like you were able to run these scripts to create the gnomad_v4 version of the CAIDS data set earlier this year. I note this PR where there were updates to the `get_caids.py` script (and mentions that there have been "a number of Hail utils that have either been changed, removed or replaced since its last update"). I'd be grateful for any suggestions you have for addressing this 😃
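For reference, here is a minimal sketch of one possible workaround, assuming the accessor was renamed rather than removed. The name `basename` is only a guess at what the method might now be called and is not confirmed here; `entry_name` and `part_urls` are hypothetical helpers, not part of Hail or of `get_caids.py`. If neither candidate exists, the helper reports the entry's public attributes so the right one can be picked by hand.

```python
def entry_name(entry, candidates=("name", "basename")):
    """Return the entry's file name via the first accessor that exists.

    `candidates` is an assumption: "name" is what the script calls today,
    "basename" is a guessed replacement that may or may not exist.
    """
    for attr in candidates:
        method = getattr(entry, attr, None)
        if callable(method):
            return method()
    # Nothing matched: list what the object does expose to aid debugging.
    public = [a for a in dir(entry) if not a.startswith("_")]
    raise AttributeError(f"no name accessor found; public attributes: {public}")


async def part_urls(entries):
    """Collect URLs of listed entries whose name starts with 'part-'.

    `entries` is any async iterable of file list entries, for example the
    result of `await fs.listfiles(sharded_vcf_url)` in the script.
    """
    return [
        await e.url()
        async for e in entries
        if entry_name(e).startswith("part-")
    ]
```

In the script, the failing comprehension would then become something like `await part_urls(await fs.listfiles(sharded_vcf_url))`, which keeps working regardless of which of the two accessor names the installed Hail version provides.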
I'm using GCP infrastructure with Python v3.11 and Hail v0.2.132.