Alliance Central: A platform for sustainable development of next generation genome knowledgebases

NIH RePORTER · NIH · U24 · $5,241,011

Abstract

SUMMARY: Model organisms are essential experimental systems for investigating and defining protein and genetic networks, discovering new gene functions, and uncovering the functional consequences of human genome variation. The Alliance of Genome Resources (the Alliance) is a consortium of seven model organism databases (MODs; Drosophila, C. elegans, budding yeast, zebrafish, laboratory mouse, laboratory rat, Xenopus) and the Gene Ontology Consortium (GOC) with a shared mission of facilitating the use of biological insights from model organisms to understand the genetic and genomic basis of human health and disease. The Alliance seeks to serve a diverse community of biomedical researchers, including basic scientists, clinicians, and data scientists. The Alliance is organized as two interdependent units: Alliance Central and Alliance Knowledge Centers. Alliance Central serves as a software platform, developed using modular infrastructure and common data models, and as the coordinating unit for data harmonization and data modeling activities across the Knowledge Centers. Alliance Knowledge Centers, including the MODs and the GOC, are responsible for expert curation and for submission of annotations to Alliance Central using community standards for knowledge representation. Alliance Central represents a next-generation, extensible software platform for knowledgebases, capable of adapting to the rapidly changing data landscape and conforming to modern standards for data management. Alliance Central provides the biomedical research community with unprecedented support for comparative genomics via unified user interfaces and APIs for common data types, and promotes the sustainability and operational efficiency of core biodata resources. This U24 application describes the plans for the enhancement and management of Alliance Central, building on the significant accomplishments of the Alliance consortium since it was launched in 2016.
Specific Aim 1 focuses on expanding the Alliance Central infrastructure for ingesting, storing, and accessing harmonized biological annotations from contributing Knowledge Centers. We will continue software development practices that reflect our long-standing commitment to data management aligned with the FAIR (Findable, Accessible, Interoperable, Reusable) principles. In Specific Aim 2 we describe our plans to continue development of a state-of-the-art literature curation system that can be adapted for use by a wide range of biomedical model organism databases. The deliverables for this aim will include the incorporation of machine learning, natural language processing, and artificial intelligence methods designed to enhance the scalability and efficiency of expert curation. In Specific Aim 3 we describe our plans for implementation and/or adoption of user interfaces that advance the mission of the Alliance: facilitating comparative genomics to gain insights into the function of the human genome. Finally, in Specific Aim 4, w...

Key facts

NIH application ID: 10976322
Project number: 2U24HG010859-06
Recipient: CALIFORNIA INSTITUTE OF TECHNOLOGY
Principal Investigator: CAROL J BULT
Activity code: U24
Funding institute: NIH
Fiscal year: 2024
Award amount: $5,241,011
Award type: 2
Project period: 2019-09-18 → 2029-06-30