# Scalable Inference in Statistical Models of Viral Evolution and Human Health

> **NIH F31** · UNIVERSITY OF CALIFORNIA LOS ANGELES · 2021 · $37,853

## Abstract

Despite global public health advances, viruses remain a major threat to human health both in the United
States and internationally. Recent and continuing outbreaks of SARS-CoV-2, Ebola, Zika, Lassa fever, and
Chikungunya, as well as persistent epidemics such as HIV, have emphasized the need to understand viral
evolution and virus-host interactions during epidemics. Phylogenetic statistical models of viral evolution offer
a powerful tool for studying the interplay between viral genetics and environmental or host factors. However,
current phylogenetic models are often too inflexible to realistically model these relationships, and those that
are flexible enough are computationally intractable for even moderately sized data sets. This project aims to develop new
statistical models that are both flexible enough to model complex biological relationships and scalable to large
data sets of viral and host traits. The first aim is to develop more efficient and less biased statistical methods
for estimating the heritability of viral phenotypes (e.g. viral load, host CD4 T-cell count, replicative capacity).
Current statistical practices typically produce biased heritability estimates and are intractable for large data
sets. This project seeks to extend state-of-the-art inference techniques to model the heritability of viral phenotypes
(enabling both unbiased and efficient inference) and to apply these new methods to better estimate the
heritability of viral load in HIV-1. The second aim seeks to develop statistical methods for studying complex,
high-dimensional viral phenotypes such as infection severity which cannot be captured with a single measurement.
These phenotypes are difficult to quantify due to their inherent complexity, confounding rigorous efforts
at, say, identifying unusually virulent viral clades. While phylogenetic factor analysis enables identification and
quantification of high-dimensional phenotypes, it scales poorly to large data sets. We propose new inference
techniques that address these scalability problems and allow previously intractable analyses. We plan to apply
these new methods to study patterns of virulence in Ebola and Lassa fever and to identify unusually virulent
viral strains. Additionally, these methods are well suited to identifying epistatic interactions between viral mutations
and phenotypes of interest, and we plan to explore these interactions in HIV, Zika, and Chikungunya
viruses. The third aim is to develop new statistical models specifically designed to predict outcomes of viral
infections from viral sequence data. To accommodate the flexibility required by these models, we
develop new inference strategies that are both highly generalizable (i.e. they do not rely on strict assumptions
in existing models) and computationally efficient. Strong predictive performance would enable researchers or
clinicians to predict clinically relevant outcomes using viral sequences, which could help inform treatment. We...

## Key facts

- **NIH application ID:** 10232891
- **Project number:** 1F31AI154824-01A1
- **Recipient organization:** UNIVERSITY OF CALIFORNIA LOS ANGELES
- **Principal Investigator:** Gabriel William Hassler
- **Activity code:** F31
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $37,853
- **Award type:** 1
- **Project period:** 2021-05-01 → 2023-04-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10232891

## Citation

> US National Institutes of Health, RePORTER application 10232891, Scalable Inference in Statistical Models of Viral Evolution and Human Health (1F31AI154824-01A1). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10232891. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
