# Unifying Templates, Ontologies and Tools to Achieve Effective Annotation of Bioassay Protocols

> **NIH U01** · UNIVERSITY OF MIAMI SCHOOL OF MEDICINE · 2020 · $511,367

## Abstract

Biological assays are the foundation for developing chemical probes and drugs, but new Big Data approaches
– which have revolutionized other areas of biomedical science – have not yet advanced this early step of
biomedical research: analysis of assay data. The obstacle is that scientists specify their assays through text
descriptions written in scientific English, which need to be translated into standardized annotations readable by
computers. This lack of standardized, machine-readable assay descriptions is a major impediment to
managing, finding, aggregating, comparing, reusing, and learning from the ever-growing corpus of assays (e.g., >1.2
million in PubChem). Thus, there is a critical need for better annotation and curation tools for drug discovery
assays. However, the process of going from a simple text protocol to highly detailed machine-readable semantic
annotations is not trivial. Multiple tools and technologies are required: ontologies (structured controlled
vocabularies); templates that map specific vocabularies to the properties to be captured; and software tools
to actually apply these ontologies to a given text. Currently, each of these exists in isolation; yet, a bottleneck
in any one tool or technology, or a gap between the different pieces, disrupts the overall process, resulting in
poor or no annotation of the datasets. Here we propose a project to combine and integrate these three
technologies (which are also the core competencies of the three groups collaborating on this proposal). We
will deliver a novel, comprehensive, user-friendly data annotation and curation system that is highly
interconnected and encompasses the full cycle, and real-world practice, of required tasks and decisions by all
parties within the 'bioassay annotation ecosystem' (researchers performing curation, dedicated curators, IT
specialists, ontology owners, and librarians/repositories). The alliance between academic and commercial
collaborators, who already work together, will greatly benefit the project and minimize execution risk. Our
specific aims are to: (1) Develop a bioassay-specific template editor and templates by adapting the Stanford
Center for Expanded Data Annotation and Retrieval (CEDAR) data model to the machine learning-based
curation tool BioAssay Express, to exploit the broad functionality of its data structures, tools and interfaces; (2)
Define and create an ontology update process and tool ('OntoloBridge') to support rapid feedback between
curators/users and ontology experts and enable semi-automated incorporation of suggestions for updates to
existing published ontologies; (3) Develop new tools to export annotated data into public repositories such as
PubChem; and (4) Evaluate our solution across diverse audiences (pharma, academia, repositories). The
system will improve bioassay curation efficiency, quality, and effectiveness, enabling scientists to generate
standardized annotations for their experiments to make these dat...
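
The abstract's three building blocks (a controlled vocabulary, a template that maps properties to permitted terms, and a tool that applies them to protocol text) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `EX:` term identifiers, the property names, and the `suggest` and `Annotation` helpers are hypothetical and do not reflect the CEDAR data model, BioAssay Express, or any real ontology. Plain substring matching stands in for the machine learning-based suggestions mentioned in the abstract.

```python
from dataclasses import dataclass, field

# Hypothetical controlled vocabulary: term ID -> label.
# The "EX:" identifiers are placeholders, not real ontology IDs.
VOCABULARY = {
    "EX:0001": "cell-based assay",
    "EX:0002": "biochemical assay",
    "EX:0101": "luminescence",
    "EX:0102": "fluorescence",
}

# Hypothetical template: each property to be captured maps to the
# set of vocabulary terms that are allowed as its value.
TEMPLATE = {
    "assay format": {"EX:0001", "EX:0002"},
    "detection method": {"EX:0101", "EX:0102"},
}


@dataclass
class Annotation:
    """Machine-readable annotation of one protocol (property -> term ID)."""

    assignments: dict = field(default_factory=dict)

    def assign(self, prop: str, term_id: str) -> None:
        # Reject properties and terms that the template does not permit.
        if prop not in TEMPLATE:
            raise KeyError(f"unknown property: {prop}")
        if term_id not in TEMPLATE[prop]:
            raise ValueError(f"{term_id} is not allowed for {prop}")
        self.assignments[prop] = term_id

    def missing(self) -> list:
        # Properties the template requires that are not yet annotated.
        return sorted(p for p in TEMPLATE if p not in self.assignments)


def suggest(text: str) -> dict:
    """Suggest one term per property by naive label matching in the text."""
    lowered = text.lower()
    found = {}
    for prop, allowed in TEMPLATE.items():
        for term_id in sorted(allowed):
            if VOCABULARY[term_id] in lowered:
                found[prop] = term_id
                break
    return found


if __name__ == "__main__":
    protocol = "A cell-based assay with luminescence readout."
    annotation = Annotation()
    for prop, term_id in suggest(protocol).items():
        annotation.assign(prop, term_id)
    print(annotation.assignments, annotation.missing())
```

In this sketch, a property left in `missing()` marks a gap a curator would need to fill by hand. A gap caused by a term that the vocabulary lacks is the kind of case the proposed ontology update process is meant to address.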

## Key facts

- **NIH application ID:** 9979969
- **Project number:** 5U01LM012630-04
- **Recipient organization:** UNIVERSITY OF MIAMI SCHOOL OF MEDICINE
- **Principal Investigator:** BARRY A BUNIN
- **Activity code:** U01
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $511,367
- **Award type:** 5
- **Project period:** 2017-08-01 → 2023-07-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/9979969

## Citation

> US National Institutes of Health, RePORTER application 9979969, Unifying Templates, Ontologies and Tools to Achieve Effective Annotation of Bioassay Protocols (5U01LM012630-04). Retrieved via AI Analytics 2026-05-21 from https://api.ai-analytics.org/grant/nih/9979969. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
