Implement gVCF merging logic (SNPs only) #80

@arostamianfar

Currently, the only supported merge strategy is MOVE_TO_CALLS, which merges only samples whose records share exactly the same reference_name:start_position:end_position:reference_bases:alternate_bases.
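
To make the exact-match behavior concrete, here is a minimal sketch of key-based merging. It is not the project's actual MOVE_TO_CALLS code: the dict-based record layout and the function names `merge_key` and `move_to_calls_sketch` are assumptions made for illustration.

```python
# Hypothetical illustration of exact-key merging; records are plain dicts,
# which is an assumption and not the project's real variant class.
KEY_FIELDS = ("reference_name", "start_position", "end_position",
              "reference_bases", "alternate_bases")


def merge_key(variant):
    """Builds a hashable key from the five fields named in the issue."""
    return tuple(
        tuple(variant[f]) if isinstance(variant[f], list) else variant[f]
        for f in KEY_FIELDS)


def move_to_calls_sketch(variants):
    """Merges records with identical keys by concatenating their calls."""
    merged = {}
    for variant in variants:
        key = merge_key(variant)
        if key not in merged:
            # First record with this key: copy the key fields, start empty calls.
            merged[key] = {f: variant[f] for f in KEY_FIELDS}
            merged[key]["calls"] = []
        merged[key]["calls"].extend(variant["calls"])
    return list(merged.values())
```

Under this scheme a non-variant block spanning positions 1-100 and a SNP at position 55 have different keys, so they are never combined into one row.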

For gVCF files, we would like to add a 0/0 call for samples that don't have a variant at a given position. For instance, suppose "sample1" has a non-variant block spanning positions 1-100 and "sample2" has a variant with genotype 1/0 at position 55. We should output a single row at position 55 containing both samples, with genotype 0/0 for "sample1" and 1/0 for "sample2".
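
The desired output for this example can be sketched as follows. This is only an illustration under stated assumptions, not the logic in merge_with_non_variants_strategy.py: records are plain dicts, intervals are treated as half-open (start inclusive, end exclusive), genotypes are strings, and the function name `merge_snps_with_non_variants` is hypothetical. It uses a simple linear scan instead of an interval tree.

```python
def merge_snps_with_non_variants(snps, non_variant_blocks):
    """Emits one row per SNP, adding 0/0 for samples whose block covers it."""
    rows = []
    for snp in snps:
        # Start with the genotypes of samples that actually carry the SNP.
        genotypes = {c["name"]: c["genotype"] for c in snp["calls"]}
        for block in non_variant_blocks:
            same_contig = block["reference_name"] == snp["reference_name"]
            # Assumption: half-open interval [start, end).
            covers = block["start"] <= snp["start"] < block["end"]
            if same_contig and covers:
                for call in block["calls"]:
                    # Never overwrite a real variant call with 0/0.
                    genotypes.setdefault(call["name"], "0/0")
        rows.append({
            "reference_name": snp["reference_name"],
            "start": snp["start"],
            "reference_bases": snp["reference_bases"],
            "alternate_bases": snp["alternate_bases"],
            "genotypes": genotypes,
        })
    return rows


# The example from the description: sample1 is non-variant over 1-100,
# sample2 has a 1/0 SNP at position 55 (bases chosen arbitrarily).
blocks = [{"reference_name": "chr1", "start": 1, "end": 101,
           "calls": [{"name": "sample1"}]}]
snps = [{"reference_name": "chr1", "start": 55,
         "reference_bases": "A", "alternate_bases": ["T"],
         "calls": [{"name": "sample2", "genotype": "1/0"}]}]
rows = merge_snps_with_non_variants(snps, blocks)
```

Here `rows` holds a single entry at position 55 whose `genotypes` map "sample1" to 0/0 and "sample2" to 1/0. The linear scan over blocks is quadratic in the worst case, which is presumably why the existing strategy relies on an interval tree.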

We already have merging logic for gVCF files (merge_with_non_variants_strategy.py). It can handle non-variants, but it needs to be cleaned up and tested further. It also depends on a fix in the intervaltree library (see chaimleib/intervaltree#63).

This task is only for supporting merging SNPs with non-variants.
