Learning about the evolution of structural variations from genomic and transcriptomic data

NIH RePORTER · NIH · R35 · $373,597

Abstract

PROJECT SUMMARY

Structural variations are key drivers of both evolutionary adaptation and human disease. My group develops and applies computational and statistical approaches for understanding the evolution of structural variations from patterns in their genomic and transcriptomic data. During the past few years, our studies have focused primarily on gene duplication, which represents the most common type of structural variation observed in nature. In particular, we investigated the origins of evolutionary innovation after gene duplication, a problem of long-standing interest in the evolutionary genomics community. To answer this question, we designed the first method for classifying evolutionary outcomes of duplicate genes from phylogenetic comparisons of their gene expression profiles. By applying this decision tree method to multi-tissue gene expression data, we were able to classify evolutionary outcomes of duplicate genes in Drosophila, mammals, and grasses. These studies revealed frequent tissue-specific expression divergence after duplication, as well as sequence and expression differences within and among taxa that are consistent with natural selection. In a follow-up population-genomic analysis, we demonstrated that natural selection indeed plays an important role in the evolutionary outcomes of young duplicate genes in Drosophila. Later, we developed analogous decision tree classifiers for two additional types of structural variations: gene deletion and translocation. Applications of our methods to sequence and expression data from multiple tissues and developmental stages in Drosophila uncovered rapid divergence concordant with adaptation, suggesting that natural selection shapes the evolutionary trajectories of structural variations generated by deletion and translocation as well.
However, our recent analyses revealed that these decision tree methods have many limitations, including sensitivity to gene expression stochasticity, lack of statistical support, and inability to predict parameters driving the evolution of structural variations. Thus, during the next five years, my group will develop a suite of tailored model-based statistical and machine learning approaches for classifying the evolutionary outcomes and predicting the evolutionary parameters of structural variations arising from duplication, deletion, inversion, and translocation events. Our preliminary studies indicate that these techniques will be much more powerful and accurate than previous approaches, and will therefore constitute major advances in evolutionary investigations of structural variations. In addition to implementing our methods in open-source software packages, we will apply them to assess the evolutionary implications of different types of structural variations in humans and several other animal and plant taxa. Comparisons will be made among different types of structural variations, their evolutionary outcomes, and taxonomic groups. The major goa...

Key facts

NIH application ID: 10458725
Project number: 5R35GM142438-02
Recipient: FLORIDA ATLANTIC UNIVERSITY
Principal Investigator: Raquel Assis
Activity code: R35
Funding institute: NIH
Fiscal year: 2022
Award amount: $373,597
Award type: 5
Project period: 2021-08-01 → 2026-06-30