# Statistical Methods for Cancer Biomarkers

> **NIH R01** · UNIVERSITY OF MICHIGAN AT ANN ARBOR · 2021 · $272,723

## Abstract

Individualized prognostic models abound in clinical biomedicine. They are used to make predictions of the future, derived from individual patient characteristics, and will play increasingly important roles in the move towards personalized medicine. They can be used in the settings of early detection and screening, or after a cancer diagnosis to help decide on treatment, or after treatment to monitor for progression and recurrence. While some models are well established, they likely have the potential to be improved through the use of additional variables. Larger and better-quality training datasets and improved statistical models and methods will improve their accuracy, but the greatest potential for improvement is through new biomarkers. Since cancer is a heterogeneous disease with multifactorial etiology, many clinical and molecular factors will likely aid in predicting the future for a patient, and would be candidates for inclusion in a new model. The challenge we will address in this research is how to develop a new model that both includes the new biomarkers and makes use of the knowledge implicit in the existing models, when the available datasets containing the new biomarkers are only of modest size.

To develop a new model from a new dataset of modest size that contains the new biomarkers, the typical approach is to analyze these data as a separate entity and build a model based on that analysis. However, this approach does not utilize the external information from an established model. Such external information will often be available; however, it may come in the form of regression coefficients, odds ratios or other summary statistics for a subset of the variables, or in the form of a prediction from an online calculator. We will consider a variety of statistical methods for incorporating the external information.

The methods we propose to develop are motivated by specific head and neck cancer and prostate cancer studies, but have much broader applicability to other cancers and other diseases. In the head and neck study, the additional new biomarkers to be incorporated into the prediction models are HPV status and other molecular biomarkers. For the prostate cancer risk prediction model, the new biomarkers are based on proteins measured from urine.

The research is separated into three specific aims. The first aim considers the situation in which there is a modest-sized new dataset that includes a new biomarker, and there is an existing prediction model that does not include this new biomarker. The external information comes in the form of estimates and standard errors of regression parameters from an established prediction model based on a subset of the predictors. We propose a number of different frequentist and Bayesian methods, in which the information on the lower dimensional parameter space is used via inequality constraints and Lagrange multipliers, through prior distributions and through a no...
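
As a rough illustration of the "prior distributions" route mentioned in the first aim, the sketch below treats external estimates and standard errors as independent Gaussian priors on a subset of coefficients in a linear model, with flat priors on the remaining (new biomarker) coefficients. This is not the project's method: the function name, the linear-Gaussian setting, the known noise variance `sigma2`, and the simplification that coefficients from the reduced external model can be used directly as prior means for the full model are all assumptions made for illustration.

```python
import numpy as np

def posterior_mean_with_external_prior(X, y, ext_idx, ext_beta, ext_se, sigma2=1.0):
    """Posterior mean of linear-model coefficients (hypothetical helper).

    Assumptions, for illustration only:
      * y = X @ beta + noise, with known noise variance sigma2
      * coefficients in ext_idx get independent N(ext_beta, ext_se**2) priors
      * all other coefficients (e.g. new biomarkers) get flat priors
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    p = X.shape[1]

    # Prior precision is 1/se^2 for externally informed coefficients, 0 otherwise.
    prior_precision = np.zeros(p)
    prior_mean = np.zeros(p)
    prior_precision[ext_idx] = 1.0 / np.asarray(ext_se, dtype=float) ** 2
    prior_mean[ext_idx] = ext_beta

    # Standard conjugate Gaussian update: combine data and prior precisions.
    A = X.T @ X / sigma2 + np.diag(prior_precision)
    b = X.T @ y / sigma2 + prior_precision * prior_mean
    return np.linalg.solve(A, b)

# Simulated example: two established predictors plus one new biomarker.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))                     # modest-sized new dataset
y = X @ np.array([1.0, -0.5, 0.8]) + rng.normal(size=40)

beta_hat = posterior_mean_with_external_prior(
    X, y,
    ext_idx=[0, 1],          # columns covered by the external model
    ext_beta=[0.9, -0.4],    # made-up external estimates
    ext_se=[0.1, 0.1],       # made-up external standard errors
)
print(beta_hat)
```

Smaller external standard errors pull the first two coefficients more strongly toward the external estimates, while very large standard errors recover ordinary least squares.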

## Key facts

- **NIH application ID:** 10199945
- **Project number:** 5R01CA129102-12
- **Recipient organization:** UNIVERSITY OF MICHIGAN AT ANN ARBOR
- **Principal Investigator:** Debashis Ghosh
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $272,723
- **Award type:** 5
- **Project period:** 2009-01-01 → 2023-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10199945

## Citation

> US National Institutes of Health, RePORTER application 10199945, Statistical Methods for Cancer Biomarkers (5R01CA129102-12). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10199945. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*