# Project 1: Overcoming methodologic barriers to analysis of observational clinico-genomic data in oncology

> **NIH P01** · SLOAN-KETTERING INST CAN RESEARCH · 2024 · $286,731

## Abstract

Project Leaders: Kenneth Kehl (DFCI); Deborah Schrag (MSK)

Precision oncology, which seeks to identify
biomarkers to guide treatment selection for individual patients, has been applied increasingly in cancer
research and clinical care. Pursuing this objective requires access to large databases of tumors that have been
both molecularly characterized and clinically annotated. However, the absence of scalable methods for
gathering and analyzing the clinical endpoints necessary to pursue patient-relevant research questions has
been a major barrier to constructing such datasets. Key cancer outcomes, including response to treatment, are
generally not recorded in a structured format in “real-world” electronic health record (EHR) datasets. Extraction
of such outcomes from EHRs has historically required resource-intensive manual medical records review,
which has in turn suffered from the lack of a standardized data model for medical record annotation across
studies. Real-world molecular testing and follow-up patterns, which may be correlated with endpoints of
interest, constitute an additional challenge to clinico-genomic analysis. Methods to reliably extract clinically
interpretable, reproducible endpoints from EHRs are necessary to advance precision oncology. The
overarching objective of this proposal is to develop, refine, and test such methods at scale. Towards this end,
we have developed the Pathology, Radiology/Imaging, Signs/Symptoms, Medical oncologist assessment, and
bioMarkers (PRISSMM) data model for extracting structured, reproducible cancer outcomes. PRISSMM
provides a rubric for abstraction of specific cancer outcomes from individual imaging reports and medical
oncologist notes and can be used by investigators at any health care system, agnostic to EHR vendor. These
outcomes include the presence of cancer within specific EHR imaging reports and clinical notes, including
assessments of tumor at specific body sites; progression/worsening; and response/improvement. Annotations
of individual reports along the disease trajectory can then be analyzed to derive relevant endpoints, such as
progression-free survival. Still, these “real-world” endpoints will only be useful if they (1) are acceptable to
diverse stakeholders; (2) can be extracted at scale; and (3) can be analyzed using methods that facilitate
unbiased inference. In this project, we will evaluate novel PRISSMM endpoints by measuring associations
among PRISSMM outcomes, traditional RECIST endpoints, and overall survival; train and validate machine
learning/“AI” models to extract endpoints at the scale of a large cross-institutional clinico-genomic dataset; and
develop best practices for time-to-event analysis given informative cohort entry and follow-up patterns in
clinico-genomic data. This project will advance methods for cancer outcome analysis based on real-world
evide...
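
As an illustration only, the sketch below shows how per-report annotations of the kind the abstract describes might be collapsed into a progression-free survival endpoint, and how a Kaplan-Meier estimate can restrict the risk set to patients who have already entered the cohort. The `ReportAnnotation` schema, the function names, and the censoring rule are assumptions made for this example; they are not the PRISSMM specification or the project's method. The risk-set adjustment shown is a standard treatment of delayed entry that assumes entry time is unrelated to outcome, which is the assumption the abstract flags as potentially violated in clinico-genomic data.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class ReportAnnotation:
    """One annotated imaging report or oncologist note (hypothetical schema)."""
    report_date: date
    progression: bool  # True if the annotator recorded progression/worsening


def derive_pfs(
    index_date: date,
    annotations: List[ReportAnnotation],
    death_date: Optional[date] = None,
) -> Tuple[int, bool]:
    """Return (days from index_date, event flag) for progression-free survival.

    Event: earliest annotated progression on or after index_date, or death.
    Censoring (assumed rule): date of the last annotated report.
    """
    on_or_after = [a for a in annotations if a.report_date >= index_date]
    event_dates = [a.report_date for a in on_or_after if a.progression]
    if death_date is not None:
        event_dates.append(death_date)
    if event_dates:
        return (min(event_dates) - index_date).days, True
    last_seen = max((a.report_date for a in on_or_after), default=index_date)
    return (last_seen - index_date).days, False


def km_delayed_entry(
    records: List[Tuple[int, int, bool]],
) -> List[Tuple[int, float]]:
    """Kaplan-Meier estimate with delayed entry (left truncation).

    Each record is (entry_day, exit_day, event), in days from the index date.
    A patient counts as at risk at time t only if entry_day < t <= exit_day.
    Returns (t, S(t)) at each distinct event time.
    """
    # Patients whose follow-up ends at or before cohort entry contribute nothing.
    usable = [r for r in records if r[0] < r[1]]
    curve = []
    surv = 1.0
    for t in sorted({exit_day for _, exit_day, event in usable if event}):
        at_risk = sum(1 for entry, exit_day, _ in usable if entry < t <= exit_day)
        events = sum(1 for _, exit_day, event in usable if event and exit_day == t)
        surv *= 1.0 - events / at_risk
        curve.append((t, surv))
    return curve
```

In this sketch, a patient who enters the cohort at day 6 (for example, at the time of molecular testing) is not counted in the risk set for an event at day 5, so that patient's guaranteed survival before entry does not inflate the estimate.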

## Key facts

- **NIH application ID:** 10768975
- **Project number:** 1P01CA275746-01A1
- **Recipient organization:** SLOAN-KETTERING INST CAN RESEARCH
- **Principal Investigator:** Deborah Schrag
- **Activity code:** P01
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $286,731
- **Award type:** 1
- **Project period:** 2024-09-03 → 2029-08-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10768975

## Citation

> US National Institutes of Health, RePORTER application 10768975, Project 1: Overcoming methodologic barriers to analysis of observational clinico-genomic data in oncology (1P01CA275746-01A1). Retrieved via AI Analytics 2026-05-27 from https://api.ai-analytics.org/grant/nih/10768975. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
