# Extended Methods and Software Development for Health NLP

> **NIH R01** · BOSTON CHILDREN'S HOSPITAL · 2024 · $448,056

## Abstract

Our program vision is to unravel the information buried in health-related narratives by advancing text-processing
methods in a unified way across all the genres of health texts and distributing them through an advanced NLP
software platform under solid governance and sustainability. The crosscutting theme is the investigation of
methods for health NLP made possible by big data, fused with health knowledge. The underlying theme of this
renewal is the development of methods towards generalizable, efficient and knowledge-rich models in the
context of modern machine learning techniques, particularly models implementing attention mechanisms and
using large unlabeled datasets. There is growing penetration of deep learning approaches in the field of health
natural language processing. Our proposal aims to address critical methodological gaps and understudied areas
in the current, unprecedentedly fast-paced environment. Therefore, our renewal lays out novel and much-needed
explorations of health NLP research which we will advance through our specific aims. Our datasets will continue
to span the spectrum of health-related data – Electronic Medical Record clinical narratives, patient-authored
online community posts, and health-related social media. The evaluation of the methods we will develop will be
performed on the key clinical tasks of concept extraction, relation extraction, and phenotyping with comparisons
to other traditional or deep learning algorithms as baselines. We will demonstrate the impact of our methods and
tools through several use cases, ranging from clinical point of care to public health, to translational and precision
medicine. Finally, we will disseminate our work through community activities to advance the state of the art in
health natural language processing.

## Key facts

- **NIH application ID:** 10875460
- **Project number:** 5R01GM114355-08
- **Recipient organization:** BOSTON CHILDREN'S HOSPITAL
- **Principal Investigator:** Steven Bethard
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $448,056
- **Award type:** 5
- **Project period:** 2016-01-01 → 2025-05-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10875460

## Citation

> US National Institutes of Health, RePORTER application 10875460, Extended Methods and Software Development for Health NLP (5R01GM114355-08). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10875460. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
