# Building Data Science Tools for Genetic Models of Colorectal Cancer Progression and Risk

> **NIH VA I01** · DURHAM VA MEDICAL CENTER · 2024 · —

## Abstract

Colorectal cancer (CRC) is the second leading cause of cancer death in the United States. Screening for
CRC with colonoscopy reduces incidence and mortality. The VA Cooperative Studies Program #380
“Prospective Evaluation of Risk Factors for Colonic Adenomas (>1cm) in Asymptomatic Subjects”
was one of the first studies to demonstrate the safety of screening colonoscopy and highlight the
magnitude of benefit through the removal of precancerous polyps for the prevention of CRC.
However, there is considerable variability in individual risk of CRC that could impact age at initiation
of CRC screening, screening modality, and frequency of follow-up. Yet guidelines do not recognize
this variability, which leads to too much colonoscopy screening and surveillance in low-risk individuals
and insufficient or delayed screening and surveillance in high-risk individuals. Genetic and
genomic data offer a promising strategy to improve CRC risk prediction and better target CRC
screening resources. However, these genomic risk calculations do not account for the
timing of certain genetic changes, or of those changes in CRC precursors, over a clinically meaningful
time course in order to predict the timing of future CRC. Longitudinal studies of CRC precursors and
progression, identification of genetic risk factors for progression, and incorporation into personalized
risk models can only be accomplished through development of an integrated data resource and
application of new statistical models. We have begun to develop the resources to allow the large-scale
analyses needed to develop comprehensive clinical and genetic risk models. This proposal is
built on our work in CSP#380, which incorporates a longitudinal research program of 3121 Veterans who
underwent screening colonoscopy between 1994 and 1997 and have been followed for 20 years. We
have used the CSP#380 research database to apply emerging statistical models for longitudinal
cohorts that incorporate the clinical information from each colonoscopy, allowing estimates of
informative follow-up times while taking the competing risk of mortality over time into account. We
have also extended the biorepository for CSP#380 to include pathology specimens obtained at
colonoscopy to provide a longitudinal tissue resource. The goal of this proposal is to extend the
approach and models developed in CSP#380 to the VA Colonoscopy Cohort (VACC), which includes
all Veterans with exposure to colonoscopy in the VA. We will perform extensive testing in CSP#380
to provide estimates of data quality. This much larger VA data set will allow more thorough discovery
and testing of genetic factors in CRC risk models. This ambitious project will combine development
of a curated phenotype library, including histopathology results generated from VA medical records, with the
joint longitudinal models applied to CSP#380. We will begin to explore how to incorporate genetic
information into these models through pilot studies in CSP#380 with the fu...

## Key facts

- **NIH application ID:** 10765600
- **Project number:** 5I01BX005718-02
- **Recipient organization:** DURHAM VA MEDICAL CENTER
- **Principal Investigator:** Elizabeth R Hauser
- **Activity code:** I01
- **Funding institute:** VA
- **Fiscal year:** 2024
- **Award amount:** —
- **Award type:** 5
- **Project period:** 2022-10-01 → 2026-09-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10765600

## Citation

> US National Institutes of Health, RePORTER application 10765600, Building Data Science Tools for Genetic Models of Colorectal Cancer Progression and Risk (5I01BX005718-02). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10765600. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
