Variant Classification at Baylor Genetics

MAY 2024

DEFINITION

Variant classification is the process of evaluating the clinical significance of a genetic alteration. These alterations may be single nucleotide variants
(SNV), multi-nucleotide variants (MNV), or structural changes including copy number variants (CNV), large insertions, inversions, or translocations. At
Baylor Genetics, variant classification occurs in two steps. The initial review is conducted by clinical genomic scientists on the Clinical Genomics
Interpretation (CGI) team, who hold advanced degrees in human genetics or a related field and have received extensive training in variant curation
methodologies. The classification is then approved by our expert team of senior scientists and board-certified laboratory directors, who also hold joint
appointments as faculty members of the Baylor College of Medicine.

GOAL

The goal of variant classification is to determine the impact that a given variant has, or does not have, on a given phenotype. Baylor Genetics uses
the American College of Medical Genetics and Genomics (ACMG) five-tier terminology:

  1. Pathogenic (P)
  2. Likely Pathogenic (LP)
  3. Variant of Uncertain Significance (VUS)
    1. For internal classification purposes, Baylor Genetics further divides VUS into three sub-tiers: VUS favoring pathogenic (VUSP), VUS neutral (VUSN), and VUS favoring benign (VUSB)
  4. Likely Benign (LB)
  5. Benign (B)

STRATEGY

The core strategy for any variant classification is:

  1. Identify the variant
  2. Understand the gene-disease correlation
  3. Evaluate theoretical concepts based on variant type
  4. Gather and summarize clinical/population evidence
  5. Interpret experimental studies
  6. Classify the variant and report the interpretation

IDENTIFYING THE VARIANT

Genome build, chromosome, genomic coordinate, and the base change (reference vs. alternate allele) are required to identify a variant. The gene name and
mRNA transcript are also needed to build the nomenclature. Baylor Genetics follows HUGO Gene Nomenclature Committee (HGNC) gene naming
standards, Human Genome Variation Society (HGVS) variant nomenclature standards, and National Center for Biotechnology Information (NCBI)
transcript numbering. Baylor Genetics uses single-letter amino acid codes. If a variant’s identity or zygosity is not certain, the sample may be sent for
confirmation using a secondary methodology.
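
As a minimal sketch, the identifying fields listed above could be held in a single record. The class name, field names, and the compact key format are illustrative assumptions, not Baylor Genetics' internal schema, and the values in the usage line are placeholders rather than a real variant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantIdentity:
    """Fields the text lists as required to identify a variant (names are hypothetical)."""
    genome_build: str   # e.g. "GRCh38"
    chromosome: str
    position: int       # 1-based genomic coordinate
    ref: str            # reference base(s)
    alt: str            # observed (alternate) base(s)
    gene: str           # HGNC-approved gene symbol
    transcript: str     # NCBI mRNA transcript accession

    def genomic_key(self) -> str:
        """Compact build:chrom:pos:ref>alt label; a convenience key, not HGVS syntax."""
        return f"{self.genome_build}:{self.chromosome}:{self.position}:{self.ref}>{self.alt}"

    def is_snv(self) -> bool:
        """True when exactly one base is replaced by one base."""
        return len(self.ref) == 1 and len(self.alt) == 1


# Placeholder values for illustration only.
example = VariantIdentity("GRCh38", "1", 123456, "A", "G", "GENE1", "NM_000000.0")
print(example.genomic_key())  # GRCh38:1:123456:A>G
```

Freezing the record keeps the identity immutable once it has been confirmed, which suits a value that downstream curation steps only read.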

UNDERSTANDING THE GENE-DISEASE CORRELATION

Variant classification is always done in the context of a phenotype or set of phenotypes, often a disease. The primary references for gene-disease
correlation are Online Mendelian Inheritance in Man (OMIM), ClinGen, and peer-reviewed literature. If multiple diseases linked to one gene share a similar
molecular mechanism, these may be combined into a single disease for the purpose of variant classification. It is also understood that one gene may be
linked to multiple distinct diseases that cannot be combined, and that a variant may be pathogenic for one disease and uncertain for another.

GATHERING AND SUMMARIZING THE CLINICAL/POPULATION EVIDENCE

Baylor Genetics variant classification emphasizes clinical evidence gained from previously reported patients from the literature and public or internal
databases. Literature searches are performed through public search engines and supplemented by the Human Gene Mutation Database (HGMD®) (Cardiff
University / Qiagen). A robust internal Baylor Genetics database is queried for each curation. Clinical evidence is reviewed related to patient clinical
presentation and phenotype, genotype (including additional variants), family history, and segregation studies. For severe dominant conditions with
negative family history, it is important to note whether the variant is de novo and absent in gnomAD. For mitochondrial variants, it is important to note
whether the variant is maternally inherited, whether the phenotype segregates within the family, the tissue tested, the heteroplasmy level, prevalence in
controls, functional studies, and whether the variant affects a protein, a tRNA, or another gene product. Clinical evidence that lacks sufficient or convincing information is not considered, to ensure the
quality of the evidence. Curators at Baylor Genetics search ClinVar to gain understanding of how other groups have classified a variant and to gather
additional literature. Population databases including gnomAD are used to find variant frequencies among control groups, and a variant may be
downgraded if the frequencies or control counts are high.
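
The population-frequency check can be sketched as follows. Allele frequency is the allele count divided by the total number of alleles genotyped. The function names are hypothetical, and the document does not state any frequency cut-off, so the threshold is a value the caller must supply for the disease in question.

```python
def allele_frequency(allele_count, allele_number):
    """Allele frequency = observed variant alleles / total alleles genotyped."""
    if allele_number <= 0:
        raise ValueError("allele_number must be positive")
    if not 0 <= allele_count <= allele_number:
        raise ValueError("allele_count must lie between 0 and allele_number")
    return allele_count / allele_number


def exceeds_max_frequency(allele_count, allele_number, max_credible_af):
    """True when the observed frequency is above a caller-chosen maximum.

    max_credible_af is an assumption supplied by the caller; no threshold
    is given in this document.
    """
    return allele_frequency(allele_count, allele_number) > max_credible_af
```

For example, `exceeds_max_frequency(500, 100000, 1e-4)` returns `True` because the observed frequency of 0.005 is above the illustrative limit of 0.0001.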

INTERPRETING EXPERIMENTAL STUDIES

While there are many types of functional studies, several key topics are considered:

  • Is the experimental approach sensible?
  • Did authors use an appropriate model organism or environment?
  • Did authors use proper controls?
  • Did authors test the critical function(s) of the gene or protein?
  • Did authors use statistical methods to convey the significance of their findings versus expected ranges?

Ideally, authors state their hypothesis in advance and robustly comment on their conclusions. For many studies, the preference is to see both in vivo and
in vitro measurements. These factors may decide how strongly an experimental study is weighed in the final classification.

EVALUATING THEORETICAL CONCEPTS BASED ON VARIANT TYPE

All variant types: Regardless of variant type, the impact on critical domains or motifs and the role of alternative mRNA isoforms are considered. The
possible splicing impact is also assessed using multiple splicing algorithms, even if the variant is not a typical splicing variant. The impact of
pseudogenes or other homologous regions is considered for certain genes.

Frameshift and nonsense: If loss-of-function is a confirmed mechanism of disease, frameshift and nonsense variants usually begin classification as
likely pathogenic. Regarding 3’ variants: if the variant is in the last exon or creates a stop codon within 55 nucleotides of the end of the penultimate exon,
it is considered to escape nonsense-mediated decay and would be evaluated in the context of the importance of the truncated region. For frameshift
variants predicted to significantly extend the protein, the significance of similar extension variants is considered. Regarding 5’ variants: likely pathogenic
is favored for early frameshift and nonsense variants, but a lower classification may be considered if an alternative start codon in this isoform or another
isoform may compensate without impact to any critical domain or amino acid.
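
The 3’ rule above can be sketched as a small function. This is an illustrative reading of the rule, not the laboratory's implementation: the function name and parameters are assumptions, distance is measured along the spliced transcript, and treating single-exon transcripts as escaping is an added assumption not stated in the text.

```python
def predicted_to_escape_nmd(exon_lengths, stop_pos, window=55):
    """Return True if a premature stop is predicted to escape nonsense-mediated decay.

    exon_lengths: exon lengths in 5'->3' transcript order.
    stop_pos: 1-based transcript position of the first base of the new stop codon.
    window: distance from the 3' end of the penultimate exon (55 nt in the text).
    """
    total = sum(exon_lengths)
    if not 1 <= stop_pos <= total:
        raise ValueError("stop_pos is outside the transcript")
    if len(exon_lengths) < 2:
        # Assumption: with no exon-exon junction, the stop is treated as escaping.
        return True
    last_exon_start = total - exon_lengths[-1] + 1
    if stop_pos >= last_exon_start:
        return True  # stop codon lies in the last exon
    penultimate_end = last_exon_start - 1
    # Stop lies within the final `window` nucleotides before the last junction.
    return penultimate_end - stop_pos < window
```

A `True` result only indicates that the truncated region needs to be weighed for its functional importance, as the text describes; it does not by itself lower or raise a classification.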

Start loss: Known pathogenic variants affecting the start codon are considered. In addition, upstream and downstream in-frame ATG nucleotide
sequences are evaluated, including in other tissue-specific isoforms, which may compensate without impact to any critical domain or amino acid.
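
Locating candidate in-frame ATG sequences can be sketched as below. The function name and inputs are hypothetical. The sketch only lists positions that share the reading frame of the annotated start codon; whether any of them could actually compensate (sequence context, intervening stop codons, affected domains, tissue-specific isoforms) still requires the review described above.

```python
def in_frame_atgs(mrna, start_index):
    """List (codon_offset, index) for ATGs in frame with the annotated start codon.

    mrna: transcript sequence as a DNA string.
    start_index: 0-based index of the 'A' of the annotated start codon.
    Negative offsets are upstream of the annotated start; positive are downstream.
    """
    mrna = mrna.upper()
    hits = []
    # Step through every codon position that shares the frame of start_index.
    for i in range(start_index % 3, len(mrna) - 2, 3):
        if i != start_index and mrna[i:i + 3] == "ATG":
            hits.append(((i - start_index) // 3, i))
    return hits
```

For instance, `in_frame_atgs("GGATGAAAATGCCC", 2)` returns `[(2, 8)]`: one in-frame ATG two codons downstream of the annotated start.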

Stop loss: The significance of similar extension variants, length of the extension, any functional studies demonstrating a defect due to extension, and
whether they may structurally interfere with any critical C-terminal regions are considered.

Missense: Various in silico prediction tools, including CADD, REVEL, and others, are considered for interpreting missense changes. Important concepts
include conservation across species (especially primates), the physicochemical difference between reference and variant amino acids (hydropathy, size, other
properties), hotspots, coldspots, other known pathogenic variants affecting the same codon, and whether this amino acid has any known specific role.

In-frame insertions and deletions: Considerations include whether the region is highly repetitive, the clinical significance of similar in-frame changes at
the same location, and whether the variant is in an important domain or has any splicing impact. Specifically for deletions, the conservation of the deleted
amino acid(s) and whether there are any known missense or in-frame variants at the same location are important considerations.

Splicing and synonymous: In the absence of published experimental evidence, SpliceAI and other in silico algorithms are used to predict splicing
changes. Factors including possible nearby compensating splice sites and predicted branch points are also considered. If the gain or loss of an
acceptor or donor splicing sequence is predicted, and the mechanism of disease is loss of function, additional predictions are made about the possible
consequence on the reading frame and domains, including exon skipping. Lastly, RNA sequencing (RNAseq) studies may be used to facilitate
interpretation of a splicing variant.
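
The reading-frame prediction for exon skipping reduces to simple arithmetic: skipping an exon whose length is a multiple of three keeps the downstream frame, while any other length shifts it. The sketch below illustrates that check; the function name is hypothetical, and the result is only one input to the interpretation described above.

```python
def exon_skip_consequence(exon_length_nt):
    """Predict the reading-frame consequence of skipping one coding exon."""
    if exon_length_nt <= 0:
        raise ValueError("exon length must be positive")
    if exon_length_nt % 3 == 0:
        # Whole codons are removed, so downstream codons stay in frame.
        return f"in-frame loss of {exon_length_nt // 3} codons"
    return "frameshift"
```

An in-frame result still calls for checking whether the removed codons fall within a critical domain.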

Untranslated Regions (UTR): Functional studies that identify critical regions of the 5’ and 3’ UTRs, such as transcription factor binding sites and
other regulatory sequences, are used.

Intragenic CNV: Considerations include whether the reading frame is disrupted, what percentage of the gene is impacted, nonsense-mediated decay,
the presence of important domains or regions, whether the start or stop codon is impacted, and whether there are known pathogenic CNVs of
comparable size and impact. Concepts such as molecular mechanism, haploinsufficiency, and triplosensitivity also help determine the classification.
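
Several of the intragenic CNV considerations can be computed directly from exon coordinates. The following is a minimal sketch for a deletion of whole coding exons; the function name, the returned keys, and the simplification that the first and last list entries contain the start and stop codons are all assumptions made for illustration.

```python
def summarize_exon_deletion(coding_exon_lengths, first_deleted, last_deleted):
    """Summarize a deletion of whole coding exons (0-based, inclusive indices).

    coding_exon_lengths: coding-sequence length of each exon in 5'->3' order,
    assuming the first entry holds the start codon and the last the stop codon.
    """
    n = len(coding_exon_lengths)
    if not 0 <= first_deleted <= last_deleted < n:
        raise ValueError("deleted exon range is outside the transcript")
    deleted_nt = sum(coding_exon_lengths[first_deleted:last_deleted + 1])
    total_nt = sum(coding_exon_lengths)
    return {
        "percent_coding_removed": round(100 * deleted_nt / total_nt, 1),
        "frame_disrupted": deleted_nt % 3 != 0,
        "removes_start_codon": first_deleted == 0,
        "removes_stop_codon": last_deleted == n - 1,
    }
```

The summary covers only the structural questions; mechanism, haploinsufficiency, triplosensitivity, and comparison with known pathogenic CNVs remain matters of curation.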

CLASSIFYING THE VARIANT AND REPORTING THE INTERPRETATION

Baylor Genetics developed a variant classification protocol with a points-based system, building on the ACMG guidelines. A 5-point system is employed,
where +5 points are required for a pathogenic classification, and -5 points are required for a benign classification. Points are assigned for various
theoretical, clinical, population, and functional concepts in multiples of 0.5. This system considers many more scenarios than those listed in the ACMG
guidelines, while emphasizing additive clinical evidence.
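
The arithmetic of the points-based system can be sketched as follows. Only the two outer thresholds (+5 for pathogenic, -5 for benign) and the 0.5-point increments come from the text; the function name is hypothetical, and because the cut-offs for the intermediate tiers are not given here, the sketch deliberately does not guess them.

```python
def tally_points(evidence_points):
    """Sum evidence points and report the tiers whose thresholds the text states."""
    total = 0.0
    for p in evidence_points:
        # Points are assigned in multiples of 0.5.
        if not float(p * 2).is_integer():
            raise ValueError(f"{p} is not a multiple of 0.5")
        total += p
    if total >= 5:
        return total, "Pathogenic"
    if total <= -5:
        return total, "Benign"
    # LP / VUS / LB cut-offs are not stated in this document.
    return total, "intermediate tier (cut-offs not stated)"
```

For example, `tally_points([2, 1.5, 1, 0.5])` returns `(5.0, "Pathogenic")`, showing how several moderate pieces of evidence can add up to the pathogenic threshold.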

RE-EVALUATING CLASSIFICATIONS

Baylor Genetics has multiple avenues for reclassification of variants based on emerging evidence. Clinicians may contact the Baylor Genetics Client
Support team to request additional information about the classification of a variant or request re-evaluation of classification.

REFERENCES

McCormick EM, Lott MT, Dulik MC, Shen L, Attimonelli M, Vitale O, Karaa A, Bai R, Pineda-Alvarez DE, Singh LN, Stanley CM, Wong S, Bhardwaj A, Merkurjev D, Mao R, Sondheimer N, Zhang S, Procaccio V, Wallace DC, Gai X, Falk MJ. Specifications of the ACMG/AMP standards and guidelines for mitochondrial DNA variant interpretation. Hum Mutat. 2020 Dec;41(12):2028-2057. doi: 10.1002/humu.24107. Epub 2020 Nov 10. PMID: 32906214; PMCID: PMC7717623.

Richards S, Aziz N, Bale S, Bick D, Das S, Gastier-Foster J, Grody WW, Hegde M, Lyon E, Spector E, Voelkerding K, Rehm HL; ACMG Laboratory Quality Assurance Committee. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med. 2015 May;17(5):405-24. doi: 10.1038/gim.2015.30. Epub 2015 Mar 5. PMID: 25741868; PMCID: PMC4544753.

Riggs ER, Andersen EF, Cherry AM, Kantarci S, Kearney H, Patel A, Raca G, Ritter DI, South ST, Thorland EC, Pineda-Alvarez D, Aradhya S, Martin CL. Technical standards for the interpretation and reporting of constitutional copy-number variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics (ACMG) and the Clinical Genome Resource (ClinGen). Genet Med. 2020 Feb;22(2):245-257. doi: 10.1038/s41436-019-0686-8. Epub 2019 Nov 6. Erratum in: Genet Med. 2021 Nov;23(11):2230. PMID: 31690835; PMCID: PMC7313390.

Walker LC, Hoya M, Wiggins GAR, Lindy A, Vincent LM, Parsons MT, Canson DM, Bis-Brewer D, Cass A, Tchourbanov A, Zimmermann H, Byrne AB, Pesaran T, Karam R, Harrison SM, Spurdle AB; ClinGen Sequence Variant Interpretation Working Group. Using the ACMG/AMP framework to capture evidence related to predicted and observed impact on splicing: Recommendations from the ClinGen SVI Splicing Subgroup. Am J Hum Genet. 2023 Jul 6;110(7):1046-1067. doi: 10.1016/j.ajhg.2023.06.002. Epub 2023 Jun 22. PMID: 37352859.