# Methods for Evolutionary Genomics Analysis

> **NIH R35** · TEMPLE UNIV OF THE COMMONWEALTH · 2021 · $396,250

## Abstract

Continuing advances in nucleotide sequencing have resulted in the assembly of datasets containing large
numbers of species, genes, and genomic segments. Phylogenomic analyses of these data are essential to
progress in understanding evolutionary patterns across the tree of life, and are finding increasing numbers of
applications in practical analyses that require understanding of how patterns change over time. The sheer size
of phylogenomic datasets limits the practical utility of available methods due to excessive time and memory
requirements. We have developed many high impact methods and tools for comparative analysis of molecular
sequences, a tradition we propose to continue through this MIRA project by developing innovative methods that
address new challenges in phylogenomics. We will focus on pattern-based machine learning approaches with a
sparsity constraint (sparse learning, SL) applied to phylogenomics, as a complement to traditional model-based methods in
molecular evolution and phylogenetics. In the proposed SL in Phylogenomics (SLiP) framework, we will build
models that best explain the biological trait or evolutionary hypothesis of interest, with genomic loci, such as
genes, proteins, and genomic segments, serving as model parameters. Preliminary results from two example
applications establish the premise and promise of a general SLiP framework. In one, SLiP successfully detected
loci whose inclusion in a phylogenomic dataset overtakes a consistent and contrasting signal from hundreds of
other loci when inferring phylogenetic relationships. In the other example, SLiP revealed loci and biological
functional categories that harbor convergent sequence evolutionary patterns associated with the emergence of
the same trait in distinct evolutionary lineages. In all of these analyses, SLiP required only a small fraction of the
computational time and memory demanded by traditional methods, and it enabled better evolutionary contrasts
with fewer assumptions. Consequently, the successful development of SLiP will improve the feasibility, rigor,
and reproducibility of large-scale data analysis. It will also democratize big data analytics via shortened analysis
time and a relatively small memory footprint, and encourage the development of a new class of methods for
phylogenomic analysis. The framework will be distributed as a free library of SLiP functions, usable directly
from the command line and available in a graphical interface through integration with the MEGA
software.
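
The abstract describes models in which genomic loci serve as parameters and a sparsity constraint keeps only the informative ones. The following is a minimal sketch of that general idea, not the SLiP implementation: it fits an L1-penalized (lasso) linear model by coordinate descent on a toy matrix. The per-locus scores, the trait values, the penalty of 0.1, and all function names are assumptions made purely for illustration.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, penalty, n_sweeps=50):
    """Minimize (1/2n)*||y - Xw||^2 + penalty*||w||_1 by cyclic coordinate descent."""
    n_samples, n_features = len(X), len(X[0])
    weights = [0.0] * n_features
    residual = list(y)  # equals y - Xw while w is all zeros
    for _ in range(n_sweeps):
        for j in range(n_features):
            column = [row[j] for row in X]
            norm_sq = sum(c * c for c in column)
            if norm_sq == 0.0:
                continue
            # Correlation of locus j with the residual that excludes its own effect.
            rho = sum(c * (r + c * weights[j]) for c, r in zip(column, residual))
            new_weight = soft_threshold(rho / n_samples, penalty) / (norm_sq / n_samples)
            delta = new_weight - weights[j]
            if delta != 0.0:
                residual = [r - c * delta for r, c in zip(residual, column)]
                weights[j] = new_weight
    return weights


# Toy data (hypothetical): 8 species (rows) x 8 loci (columns), each entry a
# +1/-1 per-locus score. The columns form a Sylvester Hadamard matrix, so they
# are mutually orthogonal and the fit is easy to check by hand.
X = [[(-1) ** bin(i & j).count("1") for j in range(8)] for i in range(8)]

# Hypothetical trait that depends only on loci 1 and 3.
y = [2.0 * row[1] - 1.5 * row[3] for row in X]

weights = lasso_coordinate_descent(X, y, penalty=0.1)
selected = [j for j, w in enumerate(weights) if w != 0.0]
print(selected)  # [1, 3]
```

The penalty shrinks the two informative weights slightly (to about 1.9 and -1.4) and sets every other locus weight exactly to zero, which is what makes the selected set directly readable from the fitted model.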

## Key facts

- **NIH application ID:** 10086181
- **Project number:** 1R35GM139540-01
- **Recipient organization:** TEMPLE UNIV OF THE COMMONWEALTH
- **Principal Investigator:** Sudhir Kumar
- **Activity code:** R35
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $396,250
- **Award type:** 1 (new application)
- **Project period:** 2021-02-01 → 2026-01-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10086181

## Citation

> US National Institutes of Health, RePORTER application 10086181, Methods for Evolutionary Genomics Analysis (1R35GM139540-01). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10086181. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
