# Protege: A Knowledge-Engineering Environment for Advancing Biomedical Sciences

> **NIH R01** · STANFORD UNIVERSITY · 2020 · $559,088

## Abstract

The engineering of ontologies that define the entities in an application area and the relationships among them has become essential for modern work in biomedicine. Ontologies help both humans and computers to manage burgeoning volumes of data. The need to annotate, retrieve, and integrate high-throughput data sets, to process natural language, and to build systems for decision support has set many communities of investigators to work building large ontologies.

The Protégé system has become an indispensable open-source resource for an enormous international community of scientists, supporting the development, maintenance, and use of ontologies and electronic knowledge bases by biomedical investigators everywhere. The number of registered Protégé users has grown from 3,500 in 2002 to more than 300,000 as of this writing. The widespread use of ontologies in biomedicine and the availability of tools such as Protégé have brought the biomedical field to a new set of challenges that current technology was not designed to address: biomedical ontologies have grown in size and scope, and their creation, maintenance, and quality assurance have become particularly effort-intensive and error-prone. In this proposal, we will develop new methods and tools that will significantly aid biomedical researchers in easily creating and testing biomedical ontologies throughout their lifecycle.

Our plan entails four specific aims. First, we will develop methods and tools to allow biomedical scientists to easily create ontologies directly from their source documents, such as spreadsheets, tab-indented hierarchies, and document outlines. Second, we will provide the methods and tools to allow biomedical scientists to identify potential “hot spots” in their ontologies that might affect their quality. Third, we will implement a comprehensive, automated testing framework for ontologies that will assist biomedical researchers in performing ontology and data quality assurance throughout the development cycle. Fourth, we will continue to expand and support the thriving Protégé user community as it grows to include new clinicians and biomedical scientists who build the ontologies needed to support clinical care, data-driven research, and the elucidation of new discoveries.

## Key facts

- **NIH application ID:** 9848600
- **Project number:** 5R01GM121724-04
- **Recipient organization:** STANFORD UNIVERSITY
- **Principal Investigator:** Mark A Musen
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $559,088
- **Award type:** 5
- **Project period:** 2017-01-01 → 2021-12-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/9848600

## Citation

> US National Institutes of Health, RePORTER application 9848600, Protege: A Knowledge-Engineering Environment for Advancing Biomedical Sciences (5R01GM121724-04). Retrieved via AI Analytics 2026-05-22 from https://api.ai-analytics.org/grant/nih/9848600. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
