# Xenbase - Curation

> **NIH P41** · CINCINNATI CHILDRENS HOSP MED CTR · 2024 · $729,854

## Abstract

Component: CURATION

Xenbase’s mandate is to curate the data from Xenopus research and generate the definitive
reference dataset for this key model organism. In order to fulfill this mandate, the data in Xenbase
must be accurately annotated, comprehensive, and up-to-date. Xenbase contains many different
data types (genomes and genes, orthology, RNA, proteins, genomic data (RNA/ChIP-seq), gene
expression, gene function, anatomy, reagents (MOs and antibodies), mutant and transgenic lines,
phenotypes and disease associations); these data come from published literature, direct community
submissions and other sources. We curate, annotate and index data types using ontologies (i.e.,
standardized vocabularies), which make the data computer readable and FAIR compliant. This
allows us to inter-relate different data types within Xenbase and to link Xenopus data to humans
and other model organisms. Our literature module contains >52,000 Xenopus research papers.
We have curated ~4,800 of these primarily for gene expression patterns, transgenic lines and
reagents, yet we estimate that about 10,000 additional papers contain valuable data, including
phenotypes and GO, two of our main curation priorities in this renewal. Another major goal is to
curate the metadata associated with Xenopus RNA-seq and ChIP-seq datasets in the NCBI Gene
Expression Omnibus (GEO) so that we can process the data and make it available on Xenbase.
Our plan to support single cell transcriptomics will be a third major curation effort in this renewal.
In order to manage this volume of new data, our curation workflow implements several innovative
semi-automated pipelines, text mining and machine learning. Our curation goals are to stay
current with new publications, clear the backlog of to-be-curated papers (giving priority to those
with phenotypes/models of human disease), and add further data types, including diverse omic
data sets.

- Aim 1. Continue curation of Xenopus research data.
- Aim 2. Curate Xenopus phenotypes and human disease models.
- Aim 3. Curate new omics and cell biological data.

## Key facts

- **NIH application ID:** 10839930
- **Project number:** 5P41HD064556-14
- **Recipient organization:** CINCINNATI CHILDRENS HOSP MED CTR
- **Principal Investigator:** Aaron M Zorn
- **Activity code:** P41
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $729,854
- **Award type:** 5
- **Project period:** 2010-06-01 → 2026-04-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10839930

## Citation

> US National Institutes of Health, RePORTER application 10839930, Xenbase - Curation (5P41HD064556-14). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10839930. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
