The `all_fields.vcf` file contains lots of examples where we explicitly state that an INFO key is missing, rather than omitting the key, e.g. `II1=.` and `II2=.,.` in the excerpt below. This was handled before #1190 because we were treating non-present INFO keys as PAD values and only these explicit "key=." values as missing.
I don't think it's a useful distinction, and distinguishing between these two types of missingness is likely to cause more problems downstream. In any case, I'm fairly sure that treating missing keys as dimension padding isn't helpful.
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT s1 s2
1 1 . G A,C . PASS IB0 . . .
1 2 . A G,G . PASS II1=126 . . .
1 3 . A G,G . PASS II1=. . . .
1 4 . T A,C . PASS II2=459,-140 . . .
1 5 . T A,C . PASS II2=.,-140 . . .
1 6 . T A,C . PASS II2=459,. . . .
1 7 . T A,C . PASS II2=.,. . . .
However, it seems that `bcftools` at least does make this distinction, and losslessly roundtrips this VCF through BCF.
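To make the two kinds of missingness concrete, here is a minimal sketch that classifies how a key appears in a raw INFO string. The function name `classify_info_key` and its return labels are hypothetical, made up for illustration; this is not how this project or `bcftools` parses INFO, and it ignores details such as header-declared types.

```python
def classify_info_key(info, key):
    """Classify how `key` appears in a raw VCF INFO string.

    Hypothetical helper for illustration only. Returns one of:
    "absent", "flag", "all-missing", "partly-missing", "present".
    """
    # A bare "." in the INFO column means no keys at all.
    if info == ".":
        return "absent"
    for entry in info.split(";"):
        name, sep, value = entry.partition("=")
        if name != key:
            continue
        if not sep:
            # Key with no "=value" part, e.g. a Flag such as IB0.
            return "flag"
        parts = value.split(",")
        if all(p == "." for p in parts):
            # Explicitly stated as missing: II1=. or II2=.,.
            return "all-missing"
        if any(p == "." for p in parts):
            # Some elements missing: II2=.,-140 or II2=459,.
            return "partly-missing"
        return "present"
    # Key was never mentioned on this line.
    return "absent"
```

On the excerpt above, `II1=.` and `II2=.,.` both come out as "all-missing", whereas asking for `II1` on a line whose INFO is `IB0` gives "absent". The question in this issue is whether those two outcomes should ever be stored differently.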
My suggestion here is that we just edit the `all_fields.vcf` file to remove all-missing values. This seems like a pretty niche problem, and probably something we'd need to deal with explicitly at the spec level rather than here. It's not worth getting bogged down on, I think.
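As a rough sketch of what that edit could look like, the helper below drops every INFO entry whose values are all `.` and leaves partly missing entries alone. The name `drop_all_missing` is hypothetical, and the code assumes it is handed only the INFO column of a record, not the whole line.

```python
def drop_all_missing(info):
    """Remove INFO entries whose values are all "." (e.g. II1=. or II2=.,.).

    Hypothetical helper for illustration; operates on the INFO column only.
    """
    if info == ".":
        return info
    kept = []
    for entry in info.split(";"):
        name, sep, value = entry.partition("=")
        # Skip key=value entries where every element is the missing marker.
        if sep and all(p == "." for p in value.split(",")):
            continue
        kept.append(entry)
    # An INFO column with no remaining entries is written as ".".
    return ";".join(kept) if kept else "."
```

With this, the INFO column of the `II1=.` and `II2=.,.` records would become `.`, while `II2=.,-140` and `II2=459,.` would be left unchanged.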