# Apollo - Universal Infrastructure for Genome Curation

> **NIH R01** · UNIVERSITY OF CALIFORNIA BERKELEY · 2021 · $351,173

## Abstract

PROJECT SUMMARY
The cost of sequencing a genome has been dramatically reduced in the last decade, and the natural
consequence is that an ever-growing number of researchers are sequencing more and more new genomes,
both within populations and across species. Each of these researchers is collecting genomic information on a
regular basis, but their ability to collaborate and share information and expertise can be limited by the absence
of supporting tools. To address this need, we have developed Apollo, an easy-to-use web-based environment
that empowers distributed researchers to interactively explore and refine accurate genomic structural
annotations via informative visualizations. We now propose to extend this tool with the goal of fully embedding
genomic ‘crowdsourcing’ into the research lifecycle, upscaling the volume and utility of data that can be
processed by the system. We will incorporate more tools for profiling annotations (e.g. protein motifs, multiple
sequence alignments, protein family placement, and inferred function) by integrating Apollo with external
tools and services, such as Galaxy, InterPro, and track hubs; and will provide a broader range of automatic
checks and quality measures prior to submission. Apollo will serve both as an editing environment and as a
collaborative communications center, which will dramatically increase the amount of biological information that
can be used in analysis of genome-scale human datasets, generating additional insights into human disease
risk, progression and potential therapies. Critically, all contributions from scientists working in this collaborative
research environment will be individually recognized (using ORCIDs) to assure due credit is given and that the
provenance of each annotated genomic feature is available.
To realize this vision, we have outlined a series of specific aims, supported by detailed technical plans. We will
implement a more streamlined, scalable setup procedure to ease installation and deployment for newly
sequenced organisms. We will provide a standardized API providing a platform for the integration of new
capabilities and workflows tailored to individual and community needs. We will develop a stand-alone validation
package to expedite merging revised gene sets with prior versions, as well as real-time quality control during
the annotation process. We will implement an annotation-by-annotation messaging system to enable curators
to explain their decisions and discuss their reasoning with others. We will implement a provenance system to
provide scientific credit to contributing researchers. We will introduce support for the co-curation of multiple
related genomes and make better use of evolutionary information from homology searches, to improve
annotation consistency and save curator time. We will enable Apollo to function both as a ‘Track Server’, to
dynamically share new annotations via the Ensembl browser (or others), and reciprocally to act as a ‘Track
Client’,...

## Key facts

- **NIH application ID:** 10176512
- **Project number:** 5R01GM080203-14
- **Recipient organization:** UNIVERSITY OF CALIFORNIA BERKELEY
- **Principal Investigator:** Ian H Holmes
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $351,173
- **Award type:** 5
- **Project period:** 2007-08-01 → 2022-11-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10176512

## Citation

> US National Institutes of Health, RePORTER application 10176512, Apollo - Universal Infrastructure for Genome Curation (5R01GM080203-14). Retrieved via AI Analytics 2026-05-22 from https://api.ai-analytics.org/grant/nih/10176512. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
