Analysis of pathogenic tandem repeat variation in Gabriella Miller Kids First pediatric cohorts

NIH RePORTER · NIH · R03 · $169,000

Abstract

Tandem repeat expansions (TREs), most commonly of triplet repeats such as poly(CAG), are now known to underlie more than 40 human diseases. While the best-known TREs occur in late-onset neurodegenerative disorders such as hereditary ataxias and Huntington disease, pathogenic TREs have also been identified in a growing list of congenital disorders. To date, pathogenic expansions of coding or intronic repeats are known to occur in nine genes that cause congenital disease. However, despite this ample evidence that variation in tandem repeat (TR) sequences can act as the causative mutation in a wide variety of congenital disorders, to our knowledge there has been no concerted effort to systematically screen for novel TREs in cohorts of patients with congenital disease. Newly developed bioinformatic approaches for analyzing whole-genome sequencing (WGS) data now provide an opportunity to fill this knowledge gap. Drawing on the expertise we have gained working on other disease cohorts, we propose to apply these approaches to WGS data from ~3,000 trios generated by the Gabriella Miller Kids First Pediatric Research Program. We will use these data to profile TR variation genome-wide using multiple algorithmic approaches optimized to identify the full range of pathogenic TR variation, including relatively subtle expansions (increases of 1-20 TR copies) and much longer TREs (gains of 20 up to several thousand repeat copies), both of which are known to cause congenital disorders. We hypothesize that some cases of AD are caused by rare, highly penetrant pathogenic TREs. Using novel bioinformatic tools that can identify TREs, we will search for rare TREs that are observed only in AD samples, or that show significant enrichment in AD cases compared to controls, and are thus likely causative for AD.
Potentially pathogenic TREs will then be validated by PCR or long-read sequencing in available DNA samples. Given that TREs represent an established mutational mechanism that contributes to a variety of congenital disorders, we propose that the study of TR variation represents a logical step that has a high likelihood of uncovering novel genetic causes of congenital disorders.
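
The search for TREs that "show significant enrichment in cases compared to controls" can be illustrated with a small sketch. The abstract does not name a statistical test, so the one-sided Fisher's exact test below is an assumption chosen for illustration. The function name `fisher_enrichment_p` and all counts are hypothetical, and real analyses would also need multiple-testing correction across loci.

```python
from math import comb


def fisher_enrichment_p(case_carriers, case_total, control_carriers, control_total):
    """One-sided Fisher's exact test for carrier enrichment in cases.

    Returns the probability of seeing at least `case_carriers` carriers
    among the cases if carrier status were unrelated to case status
    (upper tail of the hypergeometric distribution).
    """
    carriers = case_carriers + control_carriers
    total = case_total + control_total
    # Number of ways to choose which samples are the cases.
    denom = comb(total, case_total)
    # Cases cannot hold more carriers than exist, or than there are cases.
    upper = min(carriers, case_total)
    tail = sum(
        comb(carriers, k) * comb(total - carriers, case_total - k)
        for k in range(case_carriers, upper + 1)
    )
    return tail / denom


if __name__ == "__main__":
    # Hypothetical counts: an expansion seen in 6 of 3000 cases
    # and in 0 of 3000 controls.
    p = fisher_enrichment_p(6, 3000, 0, 3000)
    print(f"one-sided p-value: {p:.4g}")
```

Exact integer arithmetic via `math.comb` keeps the sketch dependency-free. An expansion observed only in cases, as in the hypothetical counts above, is the extreme form of the enrichment pattern the abstract describes.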

Key facts

NIH application ID: 10492425
Project number: 5R03HD103782-02
Recipient: ICAHN SCHOOL OF MEDICINE AT MOUNT SINAI
Principal Investigator: Andrew James Sharp
Activity code: R03
Funding institute: NIH
Fiscal year: 2022
Award amount: $169,000
Award type: 5
Project period: 2021-09-22 → 2023-08-31