# Fine-Grained Spatial Information Extraction For Radiology Reports

> **NIH R21** · UNIVERSITY OF TEXAS HLTH SCI CTR HOUSTON · 2021 · $269,607

## Abstract

Automated medical image classification has seen enormous performance improvements recently,
particularly in radiology. The application of these approaches to Alzheimer's Disease (AD), however, has
been limited due to relatively small datasets and the limited granularity of their corresponding
phenotypes. The dataset size issue is problematic as the machine learning (ML) methods that have
achieved such remarkable performance often require enormous amounts of labeled data for training.
Furthermore, the phenotype granularity issue impedes the targeted studying of AD along the lines of
what is seen in the “precision medicine” approaches to diseases such as cancer. Solutions exist, however,
as an increasingly accepted means of acquiring large amounts of labeled data is through the use of
natural language processing (NLP) on the free-text reports associated with an image. If a radiology report
describes a patient's AD-related finding, the associated image(s) can be used to train an image classifier.
The parent project to this supplemental proposal (R21EB029575) proposes just such an NLP method while
simultaneously solving the granularity issue by extracting fine-grained spatial information from the
report. In the parent project, we are developing NLP resources and methods to improve the automated
labeling of radiology images using the corresponding study reports. The parent is not specific to AD (or
any disease), so this supplement will enable us to focus on this particularly important disease, which will
benefit significantly from improved ML-based imaging. We will focus on MRI and PET scans. The Aims
here parallel the parent project, each focusing on methods that specifically improve NLP for AD
radiological indicator extraction as well as the validation of image classification from the corresponding
labels.
These Aims include (1) extending the spatial representation and corpus for Alzheimer's, (2) extending the
NLP methods for automatic extraction, and (3) validating the AD-related labels for use in image
classification.
The long-term impact of this project is to substantially improve AD diagnosis by scaling up the amount of
labeled data available to ML-based classifiers. The short-term goal of this supplement is to focus our
NLP/Imaging combination research on the complex task of improving AD diagnosis. By extending our
project with a specific target for AD, we will initiate a sizable research effort toward this goal.

## Key facts

- **NIH application ID:** 10288320
- **Project number:** 3R21EB029575-02S1
- **Recipient organization:** UNIVERSITY OF TEXAS HLTH SCI CTR HOUSTON
- **Principal Investigator:** Kirk Edward Roberts
- **Activity code:** R21
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $269,607
- **Award type:** 3
- **Project period:** 2020-03-01 → 2022-12-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10288320

## Citation

> US National Institutes of Health, RePORTER application 10288320, Fine-Grained Spatial Information Extraction For Radiology Reports (3R21EB029575-02S1). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10288320. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*