# Methods for Evolutionary Genomics Analysis

> **NIH R35** · TEMPLE UNIV OF THE COMMONWEALTH · 2021 · $138,656

## Abstract

This administrative supplement request aims to develop a cloud-enabled, highly scalable version of the
computational core of the Molecular Evolutionary Genetics Analysis software (MEGA-CC:
www.megasoftware.net). The development of MEGA-CC is a significant component of the NIH-funded
research project to develop machine learning methods and tools for comparative analysis of molecular
sequences.

With big advances in genome sequencing, researchers are assembling datasets containing large numbers of
species, strains, genes, and genomic segments. Phylogenomic analyses of these data are essential to
understanding the dynamics of evolutionary change of pathogens, humans, and species across the tree of life.
Machine learning methods and software tools for phylogenomics are now necessary because the expanding
size of phylogenomic datasets limits the practical utility of currently available methods and tools due to
excessive computational time and memory requirements. One component of the funded grant is implementing
our new machine learning methods in the MEGA software suite (www.megasoftware.net), an extremely
popular bioinformatics package (>20,000 peer-reviewed citations and 350,000 software downloads in the year
2020 alone). The MEGA software includes a large repertoire of tools for assembling sequence alignments,
inferring evolutionary trees, estimating genetic distances and diversities, inferring ancestral sequences,
computing timetrees, and testing selection. These analyses are now required in all research investigations and
fields in which multiple DNA or RNA sequences are used.

However, MEGA and its computational core (MEGA-CC) are not optimized for distribution and execution on
cloud infrastructure and high-performance computing clusters. This supplement to the funded grant will enable
us to develop a cloud-ready version of MEGA (MEGA-CR) that harnesses the scalability, elastic computing power, and easy
software upgrades and maintenance enabled by cloud infrastructure. It will also make MEGA
interoperable with existing and future cloud infrastructure. Additionally, this supplement will facilitate using the
new machine learning methods in MEGA with big genomic data in practice, thus addressing an imminent and
fast-growing need of an increasingly large community of researchers using MEGA. MEGA-CR will increase
the usability of MEGA for the scientific community analyzing very large datasets, for which the greater accessibility,
cost-efficiency, and scalability afforded by cloud readiness are becoming crucial.

## Key facts

- **NIH application ID:** 10405153
- **Project number:** 3R35GM139540-01S1
- **Recipient organization:** TEMPLE UNIV OF THE COMMONWEALTH
- **Principal Investigator:** Sudhir Kumar
- **Activity code:** R35
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $138,656
- **Award type:** 3
- **Project period:** 2021-02-01 → 2026-01-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10405153

## Citation

> US National Institutes of Health, RePORTER application 10405153, Methods for Evolutionary Genomics Analysis (3R35GM139540-01S1). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10405153. Licensed CC0.

