Curation

NIH RePORTER · NIH · P41 · $508,683

Abstract

Summary: Through high-quality curation, we aim to integrate the wealth of Xenopus data into a single, easily accessible computer framework that accelerates research and facilitates the translation of this large amount of information into meaningful knowledge. The data in Xenbase must be comprehensive, accurate, and up to date. Xenbase contains many different data types, including genomes and gene models; DNA, RNA, and protein sequences; gene names and symbols; gene expression patterns; gene function; anatomy; orthology; reagents (MOs and antibodies); investigator information; phenotypes; and disease associations. This data comes from the published literature, direct community submissions, and other databases (e.g., NCBI, JGI, OMIM). For the information in Xenbase to be computer readable, we index and annotate data using ontologies and standardized procedures that are accepted best practices shared across major model organism databases (MODs). This allows Xenopus data to be compared with data from other animals and with human disease genes and phenotypes. The majority of the curation effort goes to reading and annotating the published Xenopus literature. Xenbase contains a corpus of ~46,000 Xenopus papers, mirroring PubMed, but to date only about 10% of these have been curated, primarily for gene expression patterns. Preliminary analyses indicate that ~11,000 uncurated papers are likely to contain valuable, high-priority data, and about 1,300 new Xenopus papers are published each year. In the Curation Component, we will annotate this Xenopus research data.

Key facts

NIH application ID: 9832152
Project number: 5P41HD064556-10
Recipient: CINCINNATI CHILDRENS HOSP MED CTR
Principal Investigator: Aaron M Zorn
Activity code: P41
Funding institute: NIH
Fiscal year: 2020
Award amount: $508,683
Award type: 5
Project period: — → 2021-04-30