# Deep clinical trajectory modeling to optimize accrual to cancer clinical trials

> **NIH K99** · DANA-FARBER CANCER INST · 2020 · $170,176

## Abstract

Electronic health records (EHRs) are now ubiquitous in routine cancer care delivery. The large volumes of data
that EHRs contain could constitute an important resource for research and quality improvement, but to date,
EHRs have not fully realized this potential. Important clinical endpoints, such as disease histology, stage,
response, progression, and burden, are often recorded in the EHR only in unstructured free-text form. Even
when structured data are available, they may be recorded only at one point in time, such as diagnosis, and
may not be as relevant later in a patient's dynamic disease trajectory. These barriers prevent scalable analysis
of EHR data for even relatively straightforward research tasks, such as identification of a cohort of patients
potentially eligible for clinical trials. Identifying patients for trials is an important challenge in cancer research,
since under 5% of adults with cancer have historically enrolled in therapeutic trials. Tools are in development to
better match patients to trials, but no such tools are both publicly available and capable of incorporating
time-specific patient phenotypes generated using unstructured EHR data. Recent rapid innovation in deep learning
techniques could provide novel solutions to these challenges. In ongoing work, I have found that natural
language processing based on a neural network architecture can reliably extract clinically relevant oncologic
endpoints from free-text radiology reports. My goal is to develop an independent research program focused on
leveraging such methods to put the EHR to use at scale for discovery and improving cancer care delivery. My
specific aims are (1) to develop and validate a clinically relevant, dynamic, pre-trained cancer trajectory model
by applying deep learning to integrated structured and unstructured EHR data; (2) to apply transfer learning to
a pre-trained cancer trajectory model to match patients to clinical trials using EHR data and clinical trial
protocols; and (3) to pilot the incorporation of cancer trajectory modeling into an institutional clinical trial
matching tool. In the near term, this work will facilitate accrual to clinical trials at our institution. During the
independent research portion of the proposal, it will constitute the basis for a general framework for conducting
scalable cancer research using EHR data.

## Key facts

- **NIH application ID:** 9880481
- **Project number:** 1K99CA245899-01
- **Recipient organization:** DANA-FARBER CANCER INST
- **Principal Investigator:** Kenneth L Kehl
- **Activity code:** K99
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $170,176
- **Award type:** 1
- **Project period:** 2020-02-01 → 2022-01-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/9880481

## Citation

> US National Institutes of Health, RePORTER application 9880481, Deep clinical trajectory modeling to optimize accrual to cancer clinical trials (1K99CA245899-01). Retrieved via AI Analytics 2026-05-22 from https://api.ai-analytics.org/grant/nih/9880481. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
