# New statistical methods and software for modeling complex multivariate survival data with large-scale covariates

> **NIH R01** · UNIVERSITY OF PITTSBURGH AT PITTSBURGH · 2022 · $301,305

## Abstract

In randomized clinical trials and observational studies, multivariate outcomes are increasingly used as
co-primary endpoints to study complex diseases or clinical outcomes composed of co-morbidities. Some modern
studies also collect large-scale genetic or imaging data for the potential of individualized risk prediction and
precision medicine development. Moreover, the precise event times for non-fatal events are sometimes
unobservable because the event status can only be determined at intermittent assessment times. The
non-fatal events may also be censored by fatal events (i.e., death), which results in semi-competing risks data. The
complex multivariate survival outcomes, together with large-scale covariates, pose great analytical challenges for
such studies. Inspired by the challenges and opportunities met in our motivating studies for two bilateral
diseases, Age-related Macular Degeneration (AMD) and Acute Otitis Media (AOM), as well as the rich data
from the hormone therapy trial in the Women's Health Initiative (WHI) and the Alzheimer's Disease Neuroimaging
Initiative (ADNI), the broad aim of this proposal is to develop new statistical and machine learning methods and
computational tools for analyzing such data. First, we will develop a class of semiparametric copula models
that flexibly and jointly model the multivariate survival data without ad hoc data simplification. A rigorous
goodness-of-fit test will be proposed for model diagnostics. Next, using the top risk factors identified from the
semiparametric copula model as inputs, we will develop a multivariate survival deep neural network to predict
individualized disease risk profiles over time, which are critical for personalized disease prevention and clinical
management. Then, based on fundamental multiple testing principles, we will propose a novel simultaneous
inference procedure to identify and infer subgroups with enhanced treatment efficacy under our proposed
copula framework. Finally, we will develop a meta-learner framework to estimate individualized treatment
effects and to derive treatment recommendation rules. The novel methodology will be immediately applied to the
ongoing AMD, AOM, and Alzheimer's disease (AD) research at the University of Pittsburgh, as well as to the data from WHI and ADNI, to
facilitate novel analyses for identifying risk factors and assessing treatment effects on disease progression,
recurrence, or prevention. The methodological advances will be applicable to a broad range of studies with
similar data features. In summary, the successful completion of the project will lead to a comprehensive
methodological framework with ready-to-use software packages, which have the potential to fundamentally
improve the current practice in analyzing such studies, and thus to enhance the discovery of disease risk
factors, improve the prediction of disease progression profiles, and increase the success of precision medicine.
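
As a rough illustration of what a copula model for multivariate survival data does, the sketch below joins two marginal survival probabilities into a joint survival probability. The abstract does not name a copula family or a form for the margins, so the Clayton copula, the exponential margins, and every function name here are assumptions chosen only for illustration; they are not the project's method or software.

```python
import math


def exponential_survival(t: float, rate: float) -> float:
    """Marginal survival S(t) = exp(-rate * t). Exponential margins are an assumption."""
    return math.exp(-rate * t)


def clayton_joint_survival(s1: float, s2: float, theta: float) -> float:
    """Joint survival P(T1 > t1, T2 > t2) under a Clayton survival copula (assumed family).

    s1 and s2 are marginal survival probabilities in (0, 1];
    theta > 0 controls dependence, with larger theta meaning stronger dependence.
    """
    if theta <= 0:
        raise ValueError("theta must be positive for the Clayton copula")
    return (s1 ** (-theta) + s2 ** (-theta) - 1.0) ** (-1.0 / theta)


def clayton_kendalls_tau(theta: float) -> float:
    """Kendall's tau implied by the Clayton copula: theta / (theta + 2)."""
    return theta / (theta + 2.0)


if __name__ == "__main__":
    # Hypothetical event rates for the two eyes of one patient (bilateral disease).
    s_left = exponential_survival(t=5.0, rate=0.10)
    s_right = exponential_survival(t=5.0, rate=0.12)
    joint = clayton_joint_survival(s_left, s_right, theta=2.0)
    print(f"marginals: {s_left:.3f}, {s_right:.3f}")
    print(f"joint survival: {joint:.3f}")
    print(f"independence would give: {s_left * s_right:.3f}")
```

With positive dependence, the joint survival probability exceeds the product of the margins, which is the value obtained if the two event times were treated as independent.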

## Key facts

- **NIH application ID:** 10453875
- **Project number:** 1R01GM141076-01A1
- **Recipient organization:** UNIVERSITY OF PITTSBURGH AT PITTSBURGH
- **Principal Investigator:** Ying Ding
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $301,305
- **Award type:** 1
- **Project period:** 2022-06-01 → 2026-05-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10453875

## Citation

> US National Institutes of Health, RePORTER application 10453875, New statistical methods and software for modeling complex multivariate survival data with large-scale covariates (1R01GM141076-01A1). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10453875. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
