Structural Variation analysis of Orofacial Cleft associated genomic regions in African and Asian populations

NIH RePORTER · NIH · R03 · $170,750

Abstract

Project Summary

Cleft lip is the fourth most common birth defect in the U.S. and is known to affect one in 800 babies worldwide each year. The Kids First program aims to uncover the etiology of childhood cancers and structural birth defects and to foster data sharing within the pediatric research community. Expert-Driven Small Projects to Strengthen Gabriella Miller Kids First Discovery (RFA-RM-22-006) is intended to “engage experts in a variety of activities that will enhance the utility of childhood cancer and/or structural birth defects genomic datasets generated by the Kids First program and/or associated phenotypic datasets and resources”. We propose an in-depth analysis of the Kids First curated datasets assembled for the cohort Orofacial Cleft: African and Asian Ancestry (253 Families), currently available through the CAVATICA framework at the Kids First data portal. To date, single nucleotide variation (SNV) analysis in syndromic and non-syndromic orofacial cleft (OFC) has found functional impairments in genes such as IRF6, BMP4, and MAPK3. However, because structural variations (SVs) account for more total base-pair variation in human genomes than SNVs, we argue that SV analysis is an important and missing component of this Kids First project. Exploring the role of SVs in the manifestation of the OFC phenotype will require a search beyond gene regions, since intergenic SVs can impair normal enhancers and transcription factors. We will explore three main SV types (deletions, duplications, and inversions) by looking for common and individual SV alleles that differ from those of the parents. We will also examine the OFC-associated genes along with their related transcription factors, paralogs, and associated intergenic regions to fully characterize potentially causative SVs. Using a set of 39 preidentified gene loci, including IRF6, we will closely survey the 244 triads and use a healthy human cohort to filter the results (1000 Genomes Project Phase 3, 2,504 samples).
This analysis will complement the SNV analysis already completed for these data and will provide additional context for the total genomic landscape. We will disseminate our work to the scientific community, compare our results with the previous copy number variation (CNV) literature, and share the SV triad workflows we develop to enable similar analyses of other Gabriella Miller Kids First Pediatric Research Program (Kids First) datasets.
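
The triad filtering described in the summary can be illustrated with a minimal sketch. It is not the project's workflow: the `SV` record, the function names, the toy coordinates, and the 50% reciprocal-overlap threshold used to decide that two calls are the same SV are all assumptions made for illustration. The sketch keeps a child's SV only if it falls in a locus of interest, is not matched in either parent, and is not matched in a healthy reference panel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    """One structural variant call (hypothetical record layout)."""
    chrom: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    svtype: str  # "DEL", "DUP", or "INV"


def reciprocal_overlap(a, b):
    """Smaller of the two overlap fractions; 0.0 if type or chromosome differ."""
    if a.chrom != b.chrom or a.svtype != b.svtype:
        return 0.0
    overlap = min(a.end, b.end) - max(a.start, b.start) + 1
    if overlap <= 0:
        return 0.0
    len_a = a.end - a.start + 1
    len_b = b.end - b.start + 1
    return min(overlap / len_a, overlap / len_b)


def matches_any(sv, others, min_ro):
    """True if sv matches any call in others at the given reciprocal overlap."""
    return any(reciprocal_overlap(sv, o) >= min_ro for o in others)


def in_any_locus(sv, loci):
    """True if sv overlaps any (chrom, start, end) locus of interest."""
    return any(sv.chrom == c and sv.start <= e and sv.end >= s
               for c, s, e in loci)


def candidate_svs(child, mother, father, panel, loci, min_ro=0.5):
    """Child SVs inside the loci that are absent from both parents and the panel.

    min_ro=0.5 is an assumed matching threshold, not a value from the proposal.
    """
    return [
        sv for sv in child
        if in_any_locus(sv, loci)
        and not matches_any(sv, mother, min_ro)
        and not matches_any(sv, father, min_ro)
        and not matches_any(sv, panel, min_ro)
    ]


# Toy data only: coordinates are invented and do not describe real loci.
loci = [("chr1", 1000, 5000)]
child = [
    SV("chr1", 1200, 1800, "DEL"),  # also seen in the mother
    SV("chr1", 3000, 3500, "DUP"),  # not seen in parents or panel
    SV("chr2", 100, 900, "INV"),    # outside the loci of interest
]
mother = [SV("chr1", 1210, 1790, "DEL")]
father = []
panel = []

result = candidate_svs(child, mother, father, panel, loci)
print(result)
```

In practice the calls would come from SV callers run on each member of the triad, and the panel would hold SV calls from the healthy cohort; how breakpoints are matched between samples is a design choice that the summary does not specify.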

Key facts

NIH application ID: 10643334
Project number: 1R03DE033083-01
Recipient: UNIVERSITY OF CONNECTICUT STORRS
Principal Investigator: DASHZEVEG BAYARSAIHAN
Activity code: R03
Funding institute: NIH
Fiscal year: 2023
Award amount: $170,750
Award type: 1
Project period: 2023-06-01 → 2025-05-31