Regression, Phylogenetics, and Study Design in Infectious Disease Epidemiology

NIH RePORTER · NIH · R01 · $347,898

Abstract

DESCRIPTION (provided by applicant): Beginning with John Snow's investigations of cholera epidemics, understanding and preventing infectious disease transmission has been one of the fundamental goals of epidemiology. Whole-genome sequences from viruses and bacteria are a promising new source of information about disease transmission, but current statistical methods are unable to incorporate these data into the analysis of transmission in households and other close-contact groups. The long-term goal is to develop statistical and epidemiologic methods that use high-resolution transmission data and genetic sequence data to inform rapid and effective public health responses to emerging infections. The goal of the proposed research is to develop flexible and robust regression models for infectious disease transmission data that can incorporate pathogen genetic sequences. These models will be based on a recently developed semiparametric regression model that can estimate parameters crucial to mathematical models of epidemics and the design of interventions, including hazard ratios for covariate effects on infectiousness and susceptibility and baseline hazards of transmission in infectious-susceptible pairs. To make it a more practical tool for infectious disease epidemiology, this model will be extended to account for external sources of infection, missing data, and small samples. The partial likelihood for this model is a sum over the set of transmission trees consistent with the epidemiologic data on person, place, and time. Since a phylogeny linking pathogen samples from infected individuals constrains the set of possible transmission trees, pathogen genetic sequence data can be combined with epidemiologic data to obtain more efficient estimates of transmission parameters. Epidemiologic and genetic data will be combined by developing algorithms to find the set of transmission trees simultaneously consistent with both.
These algorithms will be incorporated into Markov chain Monte Carlo or sequential Monte Carlo estimation procedures that will account for missing data and phylogenetic uncertainty. These methods will serve as a theoretical basis for the development of efficient case-control and case-cohort study designs for outbreak investigations and vaccine trials. The proposed research is innovative because it synthesizes survival analysis and statistical genetics to analyze infectious disease transmission data. It is significant because it will improve the collection and analysis of data and the evaluation of interventions in epidemics, allowing more effective control of emerging infections.
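
The abstract's claim that genetic data constrains the set of possible transmission trees can be made concrete with a toy enumeration. The sketch below is illustrative only and is not the proposal's algorithm: the four cases, their times, the helper names (`candidate_infectors`, `transmission_trees`), and the excluded pair B -> D are all hypothetical. It assumes each non-index case was infected by someone who was infectious at the moment of infection, and it represents the phylogenetic constraint crudely as a set of ruled-out infector-infectee pairs.

```python
from itertools import product

# Hypothetical toy outbreak (not from the proposal):
# case -> (infection time, start of infectiousness, end of infectiousness)
CASES = {
    "A": (0.0, 1.0, 6.0),
    "B": (2.0, 3.0, 8.0),
    "C": (4.0, 5.0, 9.0),
    "D": (7.0, 8.0, 12.0),
}
INDEX_CASE = "A"  # assumed to have been infected from an external source


def candidate_infectors(cases, index_case, excluded_pairs=frozenset()):
    """Map each non-index case to the cases that could have infected it.

    A case i is a candidate infector of j if i was infectious when j was
    infected and the pair (i, j) has not been ruled out by other evidence.
    """
    candidates = {}
    for j, (t_inf, _, _) in cases.items():
        if j == index_case:
            continue
        candidates[j] = [
            i
            for i, (_, start, end) in cases.items()
            if i != j and start < t_inf <= end and (i, j) not in excluded_pairs
        ]
    return candidates


def transmission_trees(candidates):
    """Enumerate every assignment of one infector to each non-index case.

    Because infectiousness starts after a case's own infection in this toy
    data, every infector was infected before its infectee, so each
    assignment is a valid (acyclic) transmission tree.
    """
    infectees = sorted(candidates)
    return [
        dict(zip(infectees, infectors))
        for infectors in product(*(candidates[j] for j in infectees))
    ]


# Trees consistent with the epidemiologic data alone.
epi_trees = transmission_trees(candidate_infectors(CASES, INDEX_CASE))

# Assumption for illustration: the phylogeny rules out direct transmission B -> D.
ruled_out = frozenset({("B", "D")})
gen_trees = transmission_trees(candidate_infectors(CASES, INDEX_CASE, ruled_out))

print(len(epi_trees))  # 4
print(len(gen_trees))  # 2
```

In this toy example the genetic constraint halves the number of trees that a sum over transmission trees would need to cover, which gives some intuition for why combining the two data sources could yield more efficient estimates.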

Key facts

NIH application ID: 9843088
Project number: 5R01AI116770-05
Recipient: OHIO STATE UNIVERSITY
Principal Investigator: Eben Kenah
Activity code: R01
Funding institute: NIH
Fiscal year: 2020
Award amount: $347,898
Award type: 5
Project period: 2016-01-01 → 2021-12-31