# High-dimensional variable selection and prediction of ordinal pathological response data

> **NIH R03** · OHIO STATE UNIVERSITY · 2020 · $74,065

## Abstract

Health status and outcomes are frequently measured on an ordinal scale. For example, in acute lymphoblastic
leukemia, minimal residual disease is an initial measure of treatment response that has been strongly predictive
of event-free survival and risk of relapse, where patients are commonly stratified into one of three ordinal groups:
standard, intermediate, or high risk. In acute myeloid leukemia, based on cytogenetic findings and selected mutations
at diagnosis, the European LeukemiaNet (ELN) classification system assigns patients into one of three risk
groups: favorable, intermediate, or adverse. Molecular features monotonically associated with these ordinal responses
may be prognostically relevant or potential therapeutic targets, so linking these ordinal responses to data
from high-throughput genomic assays is of clinical interest. We previously developed frequentist-based penalized
ordinal response models and software to enable modeling an ordinal response when high-dimensional genomic
data comprises the predictor space. Although frequentist-based penalized models provide a sparse solution and
so perform automatic variable selection, they require some method for selecting the penalty parameter (e.g., AIC,
BIC, or cross-validation) to identify a final model. However, once a specific penalty value is selected, all parameter
estimates are conditional on that value. Also, the frequentist-based approach yields little information
about the coefficients beyond whether they are non-zero. That is, there are no resulting confidence
intervals or p-values associated with the coefficient estimates. Therefore, this project will address a critical barrier to
progress in this field by developing penalized Bayesian ordinal response models applicable to high-dimensional
datasets. Advantages of the Bayesian approach are that there is no need to select a value for the penalty parameter
and that it yields credible intervals, which provide useful interpretations about the significance of each predictor.
The specific aims of this application are to: (1) Develop penalized Bayesian cumulative link, adjacent category,
and stereotype logit models for high-dimensional datasets; (2) Develop penalized Bayesian forward continuation
ratio (FCR) models with a complementary log-log link that allow for censoring for high-dimensional datasets. For
both aims we will characterize the performance of the methods using extensive simulation studies and application
to publicly available cancer datasets, develop software, and distribute R packages to CRAN. This research will
fill a critical gap as there are currently no Bayesian LASSO ordinal response models for high-dimensional data.
Through our proposed variable inclusion indicator methodology, our Bayesian approach and software developed
in this application will provide unique research methods for integrating clinical, demographic, high-throughput
genomic, and ordinal response data. Moreover, the ordinal response extensions propo...

## Key facts

- **NIH application ID:** 9879969
- **Project number:** 1R03CA245771-01
- **Recipient organization:** OHIO STATE UNIVERSITY
- **Principal Investigator:** Kellie J. Archer
- **Activity code:** R03
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $74,065
- **Award type:** 1
- **Project period:** 2019-12-13 → 2021-11-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/9879969

## Citation

> US National Institutes of Health, RePORTER application 9879969, High-dimensional variable selection and prediction of ordinal pathological response data (1R03CA245771-01). Retrieved via AI Analytics 2026-06-14 from https://api.ai-analytics.org/grant/nih/9879969. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*