Computational analysis of whole genome sequence data for discovering novel risk genes of structural birth defects

NIH RePORTER · NIH · R03 · $159,615

Abstract

Project Summary: We aim to improve our understanding of the genetic basis of structural birth defects. To achieve that, we propose to develop and improve computational methods for interpreting rare variants and to perform integrative statistical analysis of both protein-coding and noncoding variants to identify new risk genes.

In aggregate, structural birth defects are common among live births. Although the survival rate of patients with severe birth defects has improved dramatically in recent decades, many surviving patients still have significant clinical problems later in life, including growth problems, neurodevelopmental disorders, childhood cancer, and other health issues. A better understanding of the genetic basis of structural birth defects will lead to new insights into the causes of these clinical issues and will provide targets for medical intervention and treatment.

Recent large-scale genomic sequencing studies of birth defects, including projects funded by the Gabriella Miller Kids First (GMKF) program, have identified new risk genes, especially through de novo variants in protein-coding regions. However, the genetics of birth defects is complex. So far, known risk genes explain only 5 to 30% of common birth defects such as congenital heart disease, and the majority of risk genes remain unknown. The contribution of rare inherited or noncoding variants to disease risk is even less well understood. To investigate these types of variants effectively and identify new risk genes, we need larger sample sizes and better computational tools for predicting the functional impact of rare variants.

In this study, we propose two aims to address these questions by leveraging the growing GMKF whole genome sequencing (WGS) data sets across cohorts, together with the latest developments in machine learning and other genomic data sets:

Specific Aim 1. Develop and improve computational methods to prioritize damaging rare missense and noncoding variants in genetic studies.

Specific Aim 2. Integrative analysis of rare coding and noncoding variants to identify new risk genes of structural birth defects.

Our proposed study will identify new risk genes by combining GMKF WGS data sets with other exome or WGS data for the same birth defects, and will in turn improve our understanding of the pleiotropic effects and tissue specificity of risk genes and variants in birth defects. The new computational and statistical tools for interpreting rare variants will be broadly applicable to genetic studies of birth defects and other conditions.

Key facts

NIH application ID: 10354418
Project number: 1R03HL161595-01
Recipient: COLUMBIA UNIVERSITY HEALTH SCIENCES
Principal Investigator: Yufeng Shen
Activity code: R03
Funding institute: NIH
Fiscal year: 2022
Award amount: $159,615
Award type: 1
Project period: 2022-08-01 → 2024-07-31