New statistical methods and software for modeling complex multivariate survival data with large-scale covariates

NIH RePORTER · NIH · R01 · $301,305

Abstract

In randomized clinical trials and observational studies, multivariate outcomes are increasingly used as co-primary endpoints to study complex diseases or clinical outcomes composed of co-morbidities. Some modern studies also collect large-scale genetic or imaging data with the potential for individualized risk prediction and precision medicine development. Moreover, the precise event times for non-fatal events are sometimes unobservable because the event status can only be determined at intermittent assessment times. The non-fatal events may also be censored by fatal events (i.e., death), which results in semi-competing risks data. The complex multivariate survival outcomes, together with large-scale covariates, pose great analytical challenges for such studies. Inspired by the challenges and opportunities met in our motivating studies of two bilateral diseases, Age-related Macular Degeneration (AMD) and Acute Otitis Media (AOM), as well as the rich data from the hormone therapy trial in the Women's Health Initiative (WHI) and the Alzheimer's Disease Neuroimaging Initiative (ADNI), the broad aim of this proposal is to develop new statistical and machine learning methods and computational tools for analyzing such data. First, we will develop a class of semiparametric copula models that flexibly and jointly model the multivariate survival data without ad hoc data simplification. A rigorous goodness-of-fit test will be proposed for model diagnostics. Next, using the top risk factors identified from the semiparametric copula model as inputs, we will develop a multivariate survival deep neural network to predict individualized disease risk profiles over time, which are critical for personalized disease prevention and clinical management. Then, based on fundamental multiple testing principles, we will propose a novel simultaneous inference procedure to identify and infer subgroups with enhanced treatment efficacy under our proposed copula framework.
Finally, we will develop a meta-learner framework to estimate individualized treatment effects and to provide treatment recommendation rules. The novel methodology will be immediately applied to the ongoing AMD, AOM, and Alzheimer's disease (AD) research at the University of Pittsburgh, as well as to the data from WHI and ADNI, to facilitate novel analyses for identifying risk factors and assessing treatment effects on disease progression, recurrence, or prevention. The methodological advances will be applicable to a broad range of studies with similar data features. In summary, the successful completion of the project will lead to a comprehensive methodological framework with ready-to-use software packages, which have the potential to fundamentally improve the current practice in analyzing such studies, and thus to enhance the discovery of disease risk factors, improve the prediction of disease progression profiles, and increase the success of precision medicine.

Key facts

NIH application ID: 10453875
Project number: 1R01GM141076-01A1
Recipient: UNIVERSITY OF PITTSBURGH AT PITTSBURGH
Principal Investigator: Ying Ding
Activity code: R01
Funding institute: NIH
Fiscal year: 2022
Award amount: $301,305
Award type: 1
Project period: 2022-06-01 → 2026-05-31