Merged
1 change: 1 addition & 0 deletions browser/help/helpPageTableOfContents.ts
@@ -20,6 +20,7 @@ const helpPageTableOfContents: { topics: string[]; faq: FaqTopic[] } = {
'hgdp-1kg-annotations',
'v4-hts',
'v4-browser-hts',
'clinvar-hts',
'exome-capture-tech',
'combined-freq-stats',
'allele-count-zero',
103 changes: 103 additions & 0 deletions browser/help/topics/clinvar-hts.md
@@ -0,0 +1,103 @@
---
id: clinvar-hts
title: 'ClinVar Hail Tables'
---

### Overview

We release two data tables underlying the ClinVar data displayed in the gnomAD browser. On a bi-monthly basis, we process the ClinVar monthly VCV XML release into two Hail Tables, one for GRCh38 and one for GRCh37. These tables enable our users to more easily incorporate ClinVar data into external pipelines in a manner consistent with what they see in the browser.

These tables are stored in a requester-pays GCS bucket, so you must provide a billing project to access or download the data. With `gsutil`, pass the project with the `-u` flag, e.g.

```
gsutil -u YOUR_PROJECT -m cp -r \
gs://gnomad-browser-clinvar/gnomad_clinvar_grch38.ht \
gs://YOUR-BUCKET/YOUR-OPTIONAL-NESTED-BUCKETS/gnomad_clinvar_grch38.ht
```
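
If you would rather read a table in place than copy it, the following Python sketch shows one possible approach with Hail. It assumes a Hail version whose `hl.init` accepts `gcs_requester_pays_configuration`; `CLINVAR_BUCKET`, `clinvar_table_path`, and `read_clinvar_table` are illustrative names, not part of gnomAD's tooling, and the billing project is whatever project you would pass to `gsutil -u`.

```python
CLINVAR_BUCKET = "gnomad-browser-clinvar"


def clinvar_table_path(reference_genome: str) -> str:
    """Return the GCS path of the ClinVar Hail Table for a reference genome."""
    build = reference_genome.lower()
    if build not in ("grch37", "grch38"):
        raise ValueError(f"Unsupported reference genome: {reference_genome}")
    return f"gs://{CLINVAR_BUCKET}/gnomad_clinvar_{build}.ht"


def read_clinvar_table(reference_genome: str, billing_project: str):
    """Read the table directly from the requester-pays bucket with Hail."""
    # Imported lazily so the path helper above works without Hail installed.
    import hail as hl

    # Assumption: the installed Hail supports this option. It bills reads
    # from the ClinVar bucket (and only that bucket) to billing_project.
    hl.init(gcs_requester_pays_configuration=(billing_project, [CLINVAR_BUCKET]))
    return hl.read_table(clinvar_table_path(reference_genome))
```

Under those assumptions, `read_clinvar_table("GRCh38", "YOUR_PROJECT").describe()` would print the table schema.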

<br />

#### ClinVar GRCh38 Hail Table annotations

Global Fields:

- `clinvar_release_date`: Release date of the ClinVar VCV XML file used to generate this Hail Table.
- `mane_select_version`: MANE Select version used to annotate variants.

Row Fields:

- `locus`: Variant locus. Contains contig and position information.
- `alleles`: Variant alleles.
- `clinvar_variation_id`: Unique ClinVar variant ID.
- `rsid`: dbSNP reference SNP identification (rsID) number.
- `review_status`: ClinVar review status for this variant association.
- `gold_stars`: Number of gold stars assigned to this variant association.
- `clinical_significance`: Clinical significance of this variant association.
- `last_evaluated`: Date when this variant association was last evaluated.
- `submissions`: Array containing all association submissions for this variant.
  - `id`: Unique ID of this variant association submission.
  - `submitter_name`: Group or individual that submitted this variant association submission.
  - `clinical_significance`: Clinical significance of this variant association submission.
  - `last_evaluated`: Date when this variant association submission was last evaluated.
  - `review_status`: ClinVar review status for this variant association submission.
  - `conditions`: An array containing conditions associated with this variant submission.
    - `name`: Name of the condition.
    - `medgen_id`: MedGen ID of the condition.
- `variant_id`: gnomAD format variant ID.
- `reference_genome`: Reference genome of this variant.
- `chrom`: Chromosome which this variant is in.
- `pos`: Position of this variant in the chromosome.
- `ref`: Reference allele for this variant.
- `alt`: Alternate allele for this variant.
- `transcript_consequences`: Array containing variant transcript consequence information.
  - `biotype`: Transcript biotype.
  - `consequence_terms`: Array of predicted functional consequences.
  - `domains`: Set containing protein domains affected by variant.
  - `gene_id`: Unique ID of gene associated with transcript.
  - `gene_symbol`: Symbol of gene associated with transcript.
  - `hgvsc`: HGVS coding sequence notation for variant.
  - `hgvsp`: HGVS protein notation for variant.
  - `is_canonical`: Whether transcript is the canonical transcript.
  - `lof_filter`: Variant LoF filters (from [LOFTEE](https://github.com/konradjk/loftee)).
  - `lof_flags`: LOFTEE flags.
  - `lof`: Variant LOFTEE status (high confidence `HC` or low confidence `LC`).
  - `major_consequence`: Primary consequence associated with transcript.
  - `transcript_id`: Unique transcript ID.
  - `transcript_version`: Transcript version.
  - `polyphen_prediction`: [Score](https://www.nature.com/articles/nmeth0410-248) that predicts the possible impact of an amino acid substitution on the structure and function of a human protein, ranging from 0.0 (tolerated) to 1.0 (deleterious).
  - `sift_prediction`: [Score](https://www.nature.com/articles/nprot.2009.86) reflecting the scaled probability of the amino acid substitution being tolerated, ranging from 0 to 1. Scores below 0.05 are predicted to impact protein function.
  - `gene_version`: Gene version.
  - `is_mane_select`: Whether transcript is the MANE Select transcript.
  - `is_mane_select_version`: MANE Select version; has a value if this transcript is the MANE Select transcript.
  - `refseq_id`: RefSeq ID associated with transcript.
  - `refseq_version`: RefSeq version.
- `in_gnomad`: Whether or not this variant is in gnomAD.
- `gnomad`: Struct containing variant information from gnomAD.
  - `exome`: Struct containing exome information from gnomAD for this variant.
    - `filters`: Set containing variant QC filters. See `filters` description on the v4 Hail Tables [help page](v4-hts#filters).
    - `ac`: Allele count for this variant in exomes.
    - `an`: Allele number for this variant in exomes.
  - `genome`: Struct containing genome information from gnomAD for this variant.
    - `filters`: Set containing variant QC filters. See `filters` description on the v4 Hail Tables [help page](v4-hts#filters).
    - `ac`: Allele count for this variant in genomes.
    - `an`: Allele number for this variant in genomes.
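
As an illustration of how the nested `gnomad` struct might be used, here is a minimal Python sketch that pools the exome and genome counts into a single allele frequency. It assumes a row has already been converted to a plain dictionary that uses the field names listed above (how you export rows is up to you). The function name `pooled_gnomad_af` and the example variant are hypothetical, and summing counts is a naive pooling that ignores the `filters` sets.

```python
from typing import Any, Mapping, Optional


def pooled_gnomad_af(row: Mapping[str, Any]) -> Optional[float]:
    """Pool exome and genome AC/AN into one frequency, or None if unavailable."""
    if not row.get("in_gnomad"):
        return None
    gnomad = row.get("gnomad") or {}
    total_ac = 0
    total_an = 0
    for source in ("exome", "genome"):
        counts = gnomad.get(source)
        if counts is None:
            # Assumption: a variant seen in only one data type has the
            # other struct missing.
            continue
        total_ac += counts["ac"]
        total_an += counts["an"]
    if total_an == 0:
        return None
    return total_ac / total_an


# Made-up row for illustration only; not real ClinVar or gnomAD data.
example_row = {
    "variant_id": "1-12345-A-G",
    "in_gnomad": True,
    "gnomad": {
        "exome": {"filters": set(), "ac": 3, "an": 1000},
        "genome": None,
    },
}
print(pooled_gnomad_af(example_row))  # 3 / 1000
```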

<br />

#### ClinVar GRCh37 Hail Table annotations

This table has a nearly identical schema to the ClinVar GRCh38 table, with the exceptions noted below, but uses GRCh37 as its reference genome.

The GRCh37 table does not include these fields:

Global Fields:

- `mane_select_version`

Row Fields:

- Under the `transcript_consequences` array:
  - `is_mane_select`
  - `is_mane_select_version`
  - `refseq_id`
  - `refseq_version`
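
For code that needs to run against both tables, one option is to encode these differences once and filter requested fields accordingly. The sketch below is a hypothetical helper (the constant and function names are not part of any gnomAD library); it only covers the fields listed above.

```python
from typing import Iterable, List

# Fields listed above as absent from the GRCh37 table.
GRCH37_MISSING_GLOBALS = frozenset({"mane_select_version"})
GRCH37_MISSING_TRANSCRIPT_FIELDS = frozenset(
    {"is_mane_select", "is_mane_select_version", "refseq_id", "refseq_version"}
)


def portable_transcript_fields(
    requested: Iterable[str], reference_genome: str
) -> List[str]:
    """Keep only the transcript_consequences fields the given build provides."""
    if reference_genome.lower() == "grch37":
        return [f for f in requested if f not in GRCH37_MISSING_TRANSCRIPT_FIELDS]
    return list(requested)


wanted = ["gene_symbol", "hgvsc", "is_mane_select", "refseq_id"]
print(portable_transcript_fields(wanted, "GRCh37"))
print(portable_transcript_fields(wanted, "GRCh38"))
```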
20 changes: 20 additions & 0 deletions browser/src/DataPage/GnomadV2Downloads.tsx
@@ -458,6 +458,26 @@ const GnomadV2Downloads = () => {
  </FileList>
</DownloadsSection>

<DownloadsSection>
  <SectionTitle id="v2-clinvar-grch37">ClinVar</SectionTitle>
  <p>
    For more information about these files, see the{' '}
    <Link to="/help/clinvar-hts">help text</Link>.
  </p>

  <FileList>
    {/* @ts-expect-error TS(2745) FIXME: This JSX tag's 'children' prop expects type 'never... Remove this comment to see the full error message */}
    <ListItem>
      <GetUrlButtons
        gcsBucket="gnomad-browser-clinvar"
        label="ClinVar GRCh37 Browser Hail Table"
        path="/gnomad_clinvar_grch37.ht"
        includeAWS={false}
      />
    </ListItem>
  </FileList>
</DownloadsSection>

<DownloadsSection>
  <SectionTitle id="v2-resources">Resources</SectionTitle>
  <FileList>
20 changes: 20 additions & 0 deletions browser/src/DataPage/GnomadV4Downloads.tsx
@@ -553,6 +553,26 @@ const GnomadV4Downloads = () => {
  </FileList>
</DownloadsSection>

<DownloadsSection>
  <SectionTitle id="v4-clinvar-grch38">ClinVar</SectionTitle>
  <p>
    For more information about these files, see the{' '}
    <Link to="/help/clinvar-hts">help text</Link>.
  </p>

  <FileList>
    {/* @ts-expect-error TS(2745) FIXME: This JSX tag's 'children' prop expects type 'never... Remove this comment to see the full error message */}
    <ListItem>
      <GetUrlButtons
        gcsBucket="gnomad-browser-clinvar"
        label="ClinVar GRCh38 Browser Hail Table"
        path="/gnomad_clinvar_grch38.ht"
        includeAWS={false}
      />
    </ListItem>
  </FileList>
</DownloadsSection>

<DownloadsSection>
  <SectionTitle id="v4-resources">Resources</SectionTitle>
  <FileList>
122 changes: 122 additions & 0 deletions browser/src/DataPage/__snapshots__/DataPage.spec.tsx.snap
@@ -7077,6 +7077,67 @@ exports[`Data Page has no unexpected changes 1`] = `
</li>
</ul>
</section>
<section
className="c16"
>
<span
className="c8"
>
<h2
className="c17"
>
<a
aria-hidden="true"
className="c10 c11"
href="#v4-clinvar-grch38"
id="v4-clinvar-grch38"
>
<img
alt=""
aria-hidden="true"
height={12}
src="test-file-stub"
width={12}
/>
</a>
ClinVar
</h2>
</span>
<p>
For more information about these files, see the

<a
className="-Link c14"
href="/help/clinvar-hts"
onClick={[Function]}
>
help text
</a>
.
</p>
<ul
className="c20"
>
<li
className="c21"
>
<span>
ClinVar GRCh38 Browser Hail Table
</span>
<br />
Show URL for

<button
aria-label="Show Google URL for ClinVar GRCh38 Browser Hail Table"
className="c22"
onClick={[Function]}
type="button"
>
Google
</button>
</li>
</ul>
</section>
<section
className="c16"
>
@@ -21127,6 +21188,67 @@ exports[`Data Page has no unexpected changes 1`] = `
</li>
</ul>
</section>
<section
className="c16"
>
<span
className="c8"
>
<h2
className="c17"
>
<a
aria-hidden="true"
className="c10 c11"
href="#v2-clinvar-grch37"
id="v2-clinvar-grch37"
>
<img
alt=""
aria-hidden="true"
height={12}
src="test-file-stub"
width={12}
/>
</a>
ClinVar
</h2>
</span>
<p>
For more information about these files, see the

<a
className="-Link c14"
href="/help/clinvar-hts"
onClick={[Function]}
>
help text
</a>
.
</p>
<ul
className="c20"
>
<li
className="c21"
>
<span>
ClinVar GRCh37 Browser Hail Table
</span>
<br />
Show URL for

<button
aria-label="Show Google URL for ClinVar GRCh37 Browser Hail Table"
className="c22"
onClick={[Function]}
type="button"
>
Google
</button>
</li>
</ul>
</section>
<section
className="c16"
>
20 changes: 20 additions & 0 deletions browser/src/help/__snapshots__/HelpPage.spec.tsx.snap
@@ -771,6 +771,15 @@ exports[`Help Page has no unexpected changes 1`] = `
v4-browser-hts
</a>
</li>
<li>
<a
className="-Link c3"
href="/help/clinvar-hts"
onClick={[Function]}
>
clinvar-hts
</a>
</li>
<li>
<a
className="-Link c3"
@@ -1089,6 +1098,17 @@ exports[`Help Page has no unexpected changes 1`] = `
v4-browser-hts
</a>
</li>
<li
className="c10"
>
<a
className="-Link c3"
href="/help/clinvar-hts"
onClick={[Function]}
>
clinvar-hts
</a>
</li>
<li
className="c10"
>
20 changes: 20 additions & 0 deletions deploy/docs/UpdateClinvarVariants.md
@@ -115,3 +115,23 @@
```
curl -u "elastic:$ELASTICSEARCH_PASSWORD" -XPUT 'http://localhost:9200/_snapshot/backups/%3Csnapshot-%7Bnow%7BYYYY.MM.dd.HH.mm%7D%7D%3E'
```

8. Update the public ClinVar buckets

We release these final Hail Tables in a requester-pays bucket, `gs://gnomad-browser-clinvar`. Use `gsutil rsync` to keep the files in sync.

GRCh37

```
gsutil -u gnomadev -m rsync -r \
gs://gnomad-v4-data-pipeline/output/clinvar/clinvar_grch37_annotated_2.ht/ \
gs://gnomad-browser-clinvar/gnomad_clinvar_grch37.ht
```

GRCh38

```
gsutil -u gnomadev -m rsync -r \
gs://gnomad-v4-data-pipeline/output/clinvar/clinvar_grch38_annotated_2.ht/ \
gs://gnomad-browser-clinvar/gnomad_clinvar_grch38.ht
```