# Developing the Apollo software for high-throughput annotation of multiple genomes

> **NIH R01** · UNIVERSITY OF CALIFORNIA BERKELEY · 2023 · $461,731

## Abstract

Genome sequencing is becoming cheaper and more powerful. However, a bottleneck to
scientific progress using these data is the generation of high-quality genome
annotations, describing the location and function of the genes in genomic DNA. In the
past, these annotations were generated exclusively by professional biocurators, and
these biocurators continue to be a critical source of expertise. However, the number of
such biocurators is small compared to the amount of genomic DNA that is available. In
recent years, professional biocuration has been supplemented by a growing volunteer
force of biologists, each generally focused on one or two genes of interest.
Although genome annotation is often done in bulk using computational tools, there is
currently no substitute for a final pass of manual annotation. This is most commonly
done using our Apollo software, which allows for live simultaneous collaborative
annotation over the web. Our proposal here is to improve Apollo so as to empower both
professional biocurators and crowdsourcing volunteers. We will empower professional
biocurators by giving them the power tools they need to simultaneously annotate
multiple genomes (by exploiting synteny), annotate variants, and annotate the function
of genes. We will empower the crowdsourcing volunteers by lowering barriers to entry,
making Apollo more usable. For all users we will train machine learning systems to
automatically detect common annotation errors and suggest improvements. We will
support Apollo users with maintenance, bug fixes, and responses to feature requests, as
well as extensive outreach, including to developers.

## Key facts

- **NIH application ID:** 10736567
- **Project number:** 2R01GM080203-15
- **Recipient organization:** UNIVERSITY OF CALIFORNIA BERKELEY
- **Principal Investigator:** Ian H Holmes
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2023
- **Award amount:** $461,731
- **Award type:** 2 (renewal)
- **Project period:** 2007-08-01 → 2027-08-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10736567

## Citation

> US National Institutes of Health, RePORTER application 10736567, Developing the Apollo software for high-throughput annotation of multiple genomes (2R01GM080203-15). Retrieved via AI Analytics 2026-05-27 from https://api.ai-analytics.org/grant/nih/10736567. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
