# Computational analysis of complex genetic interactions

> **NIH R35** · COLD SPRING HARBOR LABORATORY · 2021 · $480,000

## Abstract

How does the DNA sequence of an organism (genotype) determine its form and function (phenotype)? New technologies such as massively parallel reporter assays (MPRAs), deep mutational scanning, and combinatorial CRISPR screens have the potential to expose the genotype-phenotype relationship at an unprecedented level of detail by measuring phenotypes for tens of thousands to millions of genotypes in a single experiment. However, interpreting the results of these experiments is difficult because the space of genotypes is intrinsically high-dimensional and combinations of mutations often interact in complicated ways. My research program is focused on developing new computational tools to analyze data from these high-throughput experiments, with the goals of (1) identifying the major qualitative features of the genotype-phenotype relationship in specific biological systems, (2) explaining how these qualitative features arise from underlying developmental, cell biological and biophysical mechanisms, (3) being able to accurately predict the phenotypes of unmeasured genotypes, and (4) quantifying the uncertainty in these predictions.

My primary research objective over the next five years is to develop new computational and statistical techniques capable of capturing higher-order epistasis, that is, genetic interactions that occur between three or more mutations. Although contemporary high-throughput mutagenesis experiments reveal that these higher-order interactions are extremely prevalent, we currently lack general, principled statistical models capable of modeling such interactions. My research group is currently developing two different, but related, methods for modeling these interactions. While both methods display state-of-the-art predictive performance on smaller datasets with tens to hundreds of thousands of genotypes, substantial work remains to adapt these methods to the scale of the largest available datasets, which contain measurements for millions of genotypes. In the coming years, we plan to build these methods into an integrated framework for analyzing complex genetic interactions, complete with quantification of uncertainty, tools for biological interpretation and exploratory data analysis, and practical software that can be used and interpreted by both computational biologists and experimentalists.

High-throughput mutagenesis experiments have the potential to transform molecular biology by providing a general-purpose tool for interrogating the genotype-phenotype relationship of an arbitrary genetic element. Important applications include mapping adaptive paths to immune escape and drug resistance variants in infectious disease, designing improved antibodies and enzymes, and genomic variant interpretation. Development of the computational tools proposed here will further these goals by providing a principled and functional framework for understanding the complex genetic interactions revealed in these experi...

## Key facts

- **NIH application ID:** 10234158
- **Project number:** 5R35GM133613-03
- **Recipient organization:** COLD SPRING HARBOR LABORATORY
- **Principal Investigator:** David Martin McCandlish
- **Activity code:** R35
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $480,000
- **Award type:** 5 (non-competing continuation)
- **Project period:** 2019-09-01 → 2024-07-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10234158

## Citation

> US National Institutes of Health, RePORTER application 10234158, Computational analysis of complex genetic interactions (5R35GM133613-03). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10234158. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
