# Statistical innovation to integrate sequences and phenotypes for scalable phylodynamic inference

> **NIH R01** · UNIVERSITY OF CALIFORNIA LOS ANGELES · 2021 · $478,330

## Abstract

This proposal targets the design, development and distribution of Bayesian statistical methods and software
to study the historical and real-time emergence of rapidly evolving pathogens, such as Ebola, human immunodeficiency,
influenza, Lassa, SARS-CoV-2, West Nile, yellow fever and Zika viruses. The proposal exploits
novel scalable data integration to equip us for large-scale epidemics and pandemics and help inform actionable
public health policy. Our multidisciplinary team carries expertise across statistical thinking, data science,
evolutionary biology and infectious diseases to leverage advancing sequencing technology and high-throughput
biological experimentation that can characterize 1000s of pathogen genomes, phenotype measurements, ecological
and clinical information from a single outbreak. Our chief innovations are three-fold. First, we will invent
and implement scalable Bayesian phylodynamic techniques to integrate phenotypic measurements and study
their correlated evolution with disease spread. Second, we will foster biologically-rich evolutionary models to
map and understand heterogeneity in disease evolution through new efficient algorithms. Third, we will develop
high-dimensional and mixed-type phenotype models to link concerted viral genotype / phenotype changes using
massively parallel computing. Although no competing software exists to integrate phenotype and sequence data
at this scale, we will compare restricted cases of our models with reduced datasets to current state-of-the-art
approaches to evaluate computational performance improvement and bias that these limitations inject using real-world
examples. This proposal will deliver low-level toolbox libraries and user-friendly software for deployment
across a rapidly expanding range of large-scale problems in statistics and medicine.

## Key facts

- **NIH application ID:** 10177121
- **Project number:** 1R01AI153044-01A1
- **Recipient organization:** UNIVERSITY OF CALIFORNIA LOS ANGELES
- **Principal Investigator:** Marc A. Suchard
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $478,330
- **Award type:** 1
- **Project period:** 2021-04-09 → 2025-03-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10177121

## Citation

> US National Institutes of Health, RePORTER application 10177121, Statistical innovation to integrate sequences and phenotypes for scalable phylodynamic inference (1R01AI153044-01A1). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10177121. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
