Longevity & Aging | Research Paper | Open Access

GREGoR Consortium Advances Rare Disease Diagnosis Through Genomic Innovation

Major NIH initiative develops new genomic tools and shares data from 7,500+ individuals to accelerate rare disease diagnosis.

Thursday, April 9, 2026
Published in Nature

Summary

The GREGoR Consortium represents a major NIH initiative to tackle the challenge that over half of suspected rare disease patients lack genetic diagnoses. This collaborative effort involves five research sites studying over 7,500 individuals from 3,000+ families using advanced genomic technologies. The consortium has contributed to 83 research papers identifying molecular diagnoses in 365 genes, with over one-third being novel disease gene discoveries. By developing new computational tools, improving structural variant detection, and rapidly sharing all data through the AnVIL platform, GREGoR aims to accelerate the adoption of emerging genomic technologies in clinical practice and catalyze global rare disease research efforts.

Detailed Summary

Rare diseases collectively affect approximately 1 in 20 people worldwide, yet more than half of individuals suspected of having a rare genetic condition remain undiagnosed despite advances in genomic medicine. The Genomics Research to Elucidate the Genetics of Rare Diseases (GREGoR) Consortium was launched by the National Human Genome Research Institute in 2021 to address this critical gap.

The consortium involves five primary research sites and a data-coordinating center, studying over 7,500 individuals from more than 3,000 families. Most of these families had undergone previous clinical genetic testing but remained unsolved, and the majority were exome-negative cases. GREGoR has contributed to 83 research papers identifying molecular diagnoses in 365 genes, with more than one-third representing novel disease gene discoveries or phenotypic expansions.

Key technological advances include new computational approaches to extract more diagnostic value from existing exome data. The consortium developed highly accurate methods for inferring genetic phase and released pairwise phasing estimates for rare coding variants through the Genome Aggregation Database (gnomAD). It also created tools that identified hundreds of pathogenic structural variant diagnoses in previously unsolved exomes. In a large-scale study of 822 families, short-read genome sequencing produced 218 new diagnoses, and diagnostic yield was 8% higher than with exome sequencing alone.

The consortium's impact extends beyond individual diagnoses through its commitment to rapid data sharing. All generated data are made available to researchers worldwide through the Analysis, Visualization and Informatics Lab-space (AnVIL), creating a foundational resource for global rare disease research. This approach aims to accelerate the development of new diagnostic approaches and foster collaborative discovery efforts across the international research community.

GREGoR's work demonstrates the potential for continued innovation in genomic reanalysis and establishes a framework for translating emerging technologies into routine clinical practice. The ultimate goal is to provide genetic diagnoses for the millions of people worldwide affected by rare diseases.

Key Findings

  • Contributed to 83 research papers identifying molecular diagnoses in 365 genes, with over one-third being novel disease gene discoveries or phenotypic expansions
  • Generated genomic data from over 7,500 individuals across more than 3,000 families with suspected rare diseases
  • Short-read genome sequencing achieved an 8% higher diagnostic yield than exome sequencing alone across the 822 families studied
  • Identified 218 new diagnoses through genome sequencing, with 28% involving variants not detectable by standard exome sequencing
  • Developed tools that identified hundreds of pathogenic structural variant diagnoses from previously unsolved exomes
  • Created highly accurate phasing methods and released pairwise phasing estimates for rare coding variants through gnomAD
  • Most study families had undergone previous clinical genetic testing but remained exome-negative and undiagnosed
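
The 8% and 28% figures in the findings above can be related with simple arithmetic. The sketch below is a rough consistency check, not a reproduction of the paper's analysis. It assumes that the 28% applies to the 218 new diagnoses and that yield is expressed per family studied (822); neither assumption is confirmed by this summary, and the variable names are illustrative.

```python
# Figures quoted in this summary.
families_studied = 822
new_diagnoses = 218
share_missed_by_exome = 0.28  # fraction of diagnoses not detectable by standard exome

# Assumption: the 28% is a share of the 218 new diagnoses.
genome_only_diagnoses = round(new_diagnoses * share_missed_by_exome)

# Assumption: yield is measured per family studied.
overall_yield = new_diagnoses / families_studied
genome_only_yield = genome_only_diagnoses / families_studied

print(f"Genome-only diagnoses: {genome_only_diagnoses}")
print(f"Diagnoses per family studied: {overall_yield:.1%}")
print(f"Genome-only diagnoses per family studied: {genome_only_yield:.1%}")
```

Under these assumptions, the genome-only share comes to roughly 7% of families, close to but not exactly the reported 8%. The underlying study may use a different denominator, such as only the families with completed analyses.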

Methodology

This is a collaborative consortium study involving five primary research sites and a data-coordinating center. The analysis included over 7,500 individuals from more than 3,000 families with suspected rare diseases, most of whom had previously undergone clinical genetic testing without receiving a diagnosis. The consortium employed multiple genomic approaches, including exome reanalysis, short-read genome sequencing, structural variant detection, and computational phasing methods. All data are made available to researchers through the AnVIL platform to enable global research collaboration.

Study Limitations

The study focuses primarily on families who remained undiagnosed after conventional clinical genetic testing, which may represent a particularly challenging subset of rare disease cases. The consortium acknowledges that most diagnoses from genome sequencing still involve protein-coding genes, suggesting continued challenges in interpreting non-coding variants. The paper does not provide detailed cost-effectiveness analyses comparing different sequencing approaches, and the long-term clinical outcomes for newly diagnosed patients are not reported.
