# Trustworthy and High-Performance Question Answering for Electronic Health Records

> **NIH R01** · UNIVERSITY OF TEXAS HLTH SCI CTR HOUSTON · 2024 · $351,000

## Abstract

Patients accumulate large volumes of information in their electronic health record (EHR), and finding this
information often proves difficult, especially given the usability issues associated with EHRs. One of the most
intuitive means of encoding information needs is with natural language questions; thus, automatic question
answering (QA) methods provide an intuitive interface for quickly accessing patient information.
An emerging and promising solution to this QA problem is the use of large language models (LLMs), which are
natural language processing (NLP) models trained on significant amounts of natural language text. LLMs
provide an excellent building block for training high-performance QA systems and have been shown to be
excellent at answering medical questions. However, LLM-based models have also been shown to provide
completely false information (“hallucinations”). When applied to EHR QA, this could result in clinicians making
diagnosis and treatment decisions based on invalid information about their patient, easily leading to harm. The
naïve use of such systems in a clinical environment is thus seen as unacceptable to some. Further, LLMs are not
well-suited to the structured data found in EHRs as the language models are trained only on unstructured text.
On the other hand, there are systems designed from the ground up to be trustworthy and well-suited to the
clinical environment. As part of our preliminary work, we developed the quEHRy system that understands when
a question is outside its capabilities and thus rarely returns a wrong answer. There is still much room for
improvement for quEHRy, however, as it currently only answers questions for structured data and its
understanding of medical concepts requires improvement. In other words, the strengths and weaknesses of
quEHRy are well-complemented by LLM-based QA systems (and vice versa). The key question this proposal tries
to answer, then, is how to combine such systems to achieve a single EHR QA system that is both trustworthy and
high-performance.
To this end, we hypothesize it is possible to create hybrid methods leveraging the power of LLMs while
maintaining the trustworthiness of carefully designed systems like quEHRy. Further, this both requires and
enables QA over both structured and unstructured EHR data, so we will investigate ways to better understand
the alignment between structured and unstructured data, and how this can be leveraged for EHR QA.

## Key facts

- **NIH application ID:** 10858562
- **Project number:** 1R01LM014508-01
- **Recipient organization:** UNIVERSITY OF TEXAS HLTH SCI CTR HOUSTON
- **Principal Investigator:** Kirk Edward Roberts
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $351,000
- **Award type:** 1
- **Project period:** 2024-09-16 → 2028-08-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10858562

## Citation

> US National Institutes of Health, RePORTER application 10858562, Trustworthy and High-Performance Question Answering for Electronic Health Records (1R01LM014508-01). Retrieved via AI Analytics 2026-05-26 from https://api.ai-analytics.org/grant/nih/10858562. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*