# Integrate cancer genomics data in genetic studies and diagnosis of developmental disorders

> **NIH R01** · COLUMBIA UNIVERSITY HEALTH SCIENCES · 2021 · $333,146

## Abstract

Project Summary

We aim to develop novel computational approaches to improve detection of risk genes and prediction of functional effects of germline mutations in patients with developmental disorders by integrating somatic cancer mutation and functional genomic data.

Developmental disorders (DD), including neurodevelopmental disorders (NDD) and structural birth defects, affect ~5% of all newborns and have a significant impact on families and society. In the past few years, large-scale family-based sequencing studies on DD, such as autism and congenital heart disease, have identified a large number of de novo variants potentially implicated in disease. Unlike many other pediatric Mendelian diseases, genetic diagnosis of DD by genome or exome sequencing is more challenging because: (a) the complete catalog of DD genes (likely ~1,000) is not yet available; and (b) observed variants are often difficult to interpret due to the lack of rapid and cost-effective functional assays. Therefore, improved ability to identify novel risk genes and predict the functional effects of missense variants would significantly improve our ability to diagnose DD and develop targeted therapeutic approaches.

Cancer is driven by dysregulation of core cellular processes that are also important to DD, such as proliferation, growth, and differentiation. There are well-known genes implicated in both cancer and DD, with somatic driver mutations in cancer and highly penetrant germline de novo variants in DD. We analyzed data from recent large-scale genomic studies of cancer and DD and found a large number of genes potentially implicated in both diseases, many of which have similar molecular modes of action across conditions. This indicates that patterns of cancer somatic mutations can provide valuable insights to improve our ability to identify causal variants and genes in patients with DD.

To that end, we have these specific aims:

- **Specific Aim 1.** Elucidate common genes and variants disrupted in cancer and DD based on somatic mutations in cancer and germline de novo mutations in DD.
- **Specific Aim 2.** Infer dosage-sensitive genes by integrating mutation data in cancer and developmental disorders with functional genomic data.
- **Specific Aim 3.** Software development and data sharing.

With the proposed new computational approaches, we will be able to leverage the accumulating cancer somatic mutation data from international cancer precision medicine efforts. In this framework, tumor samples will be natural “laboratories” for large-scale functional assays in cancer driver genes. This strategy will improve the utility of cross-field genomic data and allow us to better predict functional effects of candidate variants (especially missense variants) in genetic diagnosis and identify novel risk genes for developmental disorders.

## Key facts

- **NIH application ID:** 10166608
- **Project number:** 5R01GM120609-05
- **Recipient organization:** COLUMBIA UNIVERSITY HEALTH SCIENCES
- **Principal Investigator:** Yufeng Shen
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $333,146
- **Award type:** 5
- **Project period:** 2017-08-16 → 2023-05-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10166608

## Citation

> US National Institutes of Health, RePORTER application 10166608, Integrate cancer genomics data in genetic studies and diagnosis of developmental disorders (5R01GM120609-05). Retrieved via AI Analytics 2026-05-29 from https://api.ai-analytics.org/grant/nih/10166608. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
