Statistical Methods and Algorithms for Population Genomic Inference

NIH RePORTER · NIH · R01 · $206,764

Abstract

The proposed supplement is in response to the “Notice of Special Interest (NOSI) regarding the Availability of Urgent Competitive Revisions for Research on the 2019 Novel Coronavirus (2019-nCoV).” In particular, it is in response to the stated NIGMS Interests: “Incorporation of data related to the 2019-nCoV into ongoing research efforts to develop predictive models for the spread of coronaviruses and related infectious agents.” The proposed research extends several aims of the existing grant NIH 5 R01 GM123306 to develop predictive models of the spread of the coronavirus and to link these models to genomic variation in hCoV19. Specifically, the proposed research extends Aims 1 and 2 of the original proposal to develop new methods to infer admixture events and recombination among lineages of hCoV19. Recombination is an important factor in the evolution of coronaviruses and is linked to changes in virulence and transmissibility; it is therefore an important parameter of predictive models. The proposed research also extends Aim 4 of the current award by: (1) implementing “tip-dating” using viral sampling times to calibrate divergence-time estimates under a relaxed molecular-clock model; and (2) implementing realistic epidemiological priors for gene trees. These extensions allow important epidemiological parameters, such as R0, to be inferred from genomic sequence data while allowing for temporal changes in contacts between infected and susceptible individuals and for changes in sampling (intensity of genetic testing) over time and across geographic areas. Because hCoV19 has a relatively low mutation rate and is under strong purifying selection, it is important to develop integrative methods for analyzing the genetic data that incorporate information from other sources (travel histories, testing regimes, social distancing measures, etc.) and maximize the utility of the sequence data.
Implementing such an integrative approach is straightforward using the proposed Bayesian framework. The new priors we implement will allow external sources of information to be incorporated. The parameters estimated in the preceding aims are essential for constructing predictive simulations of hCoV19. The proposed research extends Aim 6 of the original proposal to develop new simulation methods for predicting the progress of the pandemic from a molecular-genetics perspective. We will develop and implement these methods in open-source software for jointly predicting both the spread of the COVID-19 pandemic through time and the changes in the genomic variation of SARS-CoV-2 under different mitigation strategies. These simulations will accommodate the selective constraints on the genome inferred from phylogenetic analyses of related coronaviruses. Genetic variation is important both for understanding the potential power of different genetic sampling strategies for analyzing the progress of the pandemic and for predicting the likelihood that adaptive changes ma...

Key facts

NIH application ID
10135748
Project number
3R01GM123306-01A1S1
Recipient
UNIVERSITY OF CALIFORNIA AT DAVIS
Principal Investigator
Bruce RANNALA
Activity code
R01
Funding institute
NIH
Fiscal year
2020
Award amount
$206,764
Award type
3
Project period
2020-02-01 → 2024-01-31