# Radiotherapy-associated breast cancer: machine learning on genotypes to predict individualized risk

> **NIH R21** · SLOAN-KETTERING INST CAN RESEARCH · 2020 · $293,870

## Abstract

Risk of developing contralateral breast cancer is a major concern among breast cancer survivors, especially for
those who received radiotherapy for a first primary breast cancer. The risk of developing radiation-associated
contralateral breast cancer (RCBC) is further increased among those who were exposed to radiation at an
early age. Several genotyping studies have shown that variation in the individual risk of developing RCBC is
associated with single nucleotide polymorphism (SNP) genetic variants. However, these studies have mainly
analyzed a limited set of target/candidate SNPs that had been associated with general primary breast cancer
in prior studies. This approach, building predictive models based on a small set of SNPs, has made only marginal
progress in distinguishing individual risk of RCBC. In contrast, complex phenotypes or traits are likely the
result of interactions among many biological sub-systems, most of which individually contribute only a small effect
to predictive models, incrementally improving risk prediction. We have recently developed novel machine learning
methods that use genome-wide SNPs to build patient-specific risk models of radiation-induced toxicity. These
models use hundreds of SNPs in a nonlinear fashion and can be used to identify key biological correlates. Our
long-term goal is to develop a clinical decision support tool that can be used to guide radiotherapy
treatment decisions based on individual risk of RCBC. To improve patient-specific risk prediction of
RCBC, we propose to apply these innovative methods to a rich dataset of the Women’s Environmental
Cancer and Radiation Epidemiology (WECARE) Study. Under SA1: Genome-wide genotyping of the
WECARE Study II, as part of this grant, we will complete genome-wide association study (GWAS)
genotyping of 1,626 samples from the WECARE Study II. Under SA2.1: Predictive modeling and biological
analysis, we will apply our novel machine learning methods to the combined WECARE Study I and II to design
a predictive model of RCBC risk in a young subpopulation treated with radiotherapy, using GWAS genotyping,
clinical, and radiation data. We will also use bioinformatics methods to identify key biological correlates
associated with RCBC risk. Under SA2.2: Comparison of biological correlates between subgroups, we will
further investigate biological processes associated with radiation-unrelated contralateral breast cancer among
women in the combined WECARE Study I and II cohort who did not receive radiotherapy. The resulting biological
correlates will be compared with those found in SA2.1 for radiotherapy-treated women to better understand
RCBC-specific biological mechanisms. Validating the model in an independent series of childhood cancer
survivors who have developed radiation-associated breast cancer will enable us to examine the reliability and
reproducibility of the model as a decision-making tool. If the RCBC risk model is validated, it will provide a
clinical guide to ...

## Key facts

- **NIH application ID:** 9878491
- **Project number:** 1R21CA234752-01A1
- **Recipient organization:** SLOAN-KETTERING INST CAN RESEARCH
- **Principal Investigator:** JONINE L. BERNSTEIN
- **Activity code:** R21
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $293,870
- **Award type:** 1
- **Project period:** 2020-02-11 → 2022-01-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/9878491

## Citation

> US National Institutes of Health, RePORTER application 9878491, Radiotherapy-associated breast cancer: machine learning on genotypes to predict individualized risk (1R21CA234752-01A1). Retrieved via AI Analytics 2026-05-22 from https://api.ai-analytics.org/grant/nih/9878491. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
