Thrasher Research Fund - Medical research grants to improve the lives of children

Project Details

Early Career

Status: Funded - Closed

Genome Sequencing for Pediatric Rare Disease Diagnosis

Monica Wojcik, MD


Background: Massively parallel (“next generation”) sequencing has revolutionized clinical genetics by enabling the identification of variants responsible for severe, early-onset conditions, particularly via large gene panels or exome sequencing (ES). However, for many individuals with rare disease, the causative variant remains elusive even after these approaches. In these cases, clinicians may turn to genome sequencing (GS), though the added value of this technique and its optimal use remain unclear.

Objective: To determine the diagnostic yield of GS within a phenotypically diverse cohort and to evaluate features enabling successful diagnosis via GS, particularly when previous methods have failed.

Methods: Through the Center for Mendelian Genomics at the Broad Institute of MIT and Harvard, we have sequenced and analyzed >8,000 families affected by predominantly congenital or pediatric-onset rare diseases, including 744 for whom GS was performed, typically after a prior negative genetic evaluation.

Results: Of the 744 families investigated using GS, a diagnosis was identified in 197 (26.5%). Most diagnoses were in previously known disease genes (146/197, 74.1%), and the remainder represented novel disease gene discoveries (51/197, 25.9%). Of all diagnoses, 128 (65.0%) had been previously missed by ES. We systematically evaluated these diagnoses for features requiring GS for diagnosis, and 60/197 (30.5%) met these criteria (8.1% of the entire genome cohort, 10.9% of the subset of the cohort with prior exome sequencing), including small structural variants (23), copy-neutral inversions (2), short tandem repeat expansions (6), deep intronic non-coding variants (11), and coding variants that are more easily found using GS (18).

Conclusion: We describe the diagnostic yield of GS in a large and diverse cohort, illustrating several types of cryptic pathogenic variation missed by ES or other techniques, most commonly structural variants (SVs). While sequencing, analysis, and storage costs for GS limit its routine application as a first-tier test, these costs are rapidly diminishing. In the meantime, these data guide selection of cases for GS and suggest prioritization of cases where SV analysis is needed and/or when there is a strong clinical suspicion for a condition but targeted testing was negative. Our data further highlight ongoing progress in the understanding of pathogenic genomic variation, particularly non-coding variation, and predict increasing diagnostic utility of GS that will further define its optimal implementation.
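
The proportions reported in the Results can be recomputed from the stated counts. The following is a minimal sketch for readers who want to check the arithmetic; the variable and function names are illustrative assumptions and do not come from the study, and rounding to one decimal place is assumed. The 10.9% figure is not recomputed because the number of families with prior exome sequencing is not given in the abstract.

```python
# Counts copied from the Results; all names here are illustrative, not from the study.
GS_FAMILIES = 744   # families investigated using genome sequencing (GS)
DIAGNOSED = 197     # families in whom a diagnosis was identified


def pct(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


# Diagnoses judged to require GS, broken down by variant type.
gs_required = {
    "small structural variants": 23,
    "copy-neutral inversions": 2,
    "short tandem repeat expansions": 6,
    "deep intronic non-coding variants": 11,
    "coding variants more easily found using GS": 18,
}
gs_required_total = sum(gs_required.values())

summary = {
    "overall diagnostic yield": pct(DIAGNOSED, GS_FAMILIES),
    "diagnoses in known disease genes": pct(146, DIAGNOSED),
    "novel disease gene discoveries": pct(51, DIAGNOSED),
    "diagnoses previously missed by ES": pct(128, DIAGNOSED),
    "diagnoses requiring GS": pct(gs_required_total, DIAGNOSED),
    "GS-requiring diagnoses, whole cohort": pct(gs_required_total, GS_FAMILIES),
}

for label, value in summary.items():
    print(f"{label}: {value}%")
```

Keeping the per-category counts in a dictionary makes it easy to confirm that the five variant categories account for all of the diagnoses that required GS.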