Semiparametric Regression Analysis of Interval-Censored Data in Current Cohort Studies

NIH RePORTER · NIH · R01 · $396,020

Abstract

PROJECT SUMMARY: In epidemiological cohort studies, the onset of an asymptomatic disease (e.g., diabetes, hypertension, chronic obstructive pulmonary disease, HIV infection, SARS-CoV-2 infection, cancer, or dementia) cannot be observed directly but rather is known to occur sometime between two consecutive clinical examinations. The two examinations bookend a time interval, such that the event time is “interval-censored”. It is highly challenging to analyze interval-censored data because none of the event times is exactly known; therefore, investigators have resorted to statistical methods that are unreliable or even invalid. The broad, long-term objectives of this research project are to develop semiparametric regression models, with associated inference procedures and numerical algorithms, for analyzing interval-censored data from current epidemiological investigations. The specific aims of the project are: (1) to explore semiparametric regression models for assessing the impact of an interval-censored event (e.g., onset of diabetes) on future outcomes (e.g., stroke, heart attack, methylation level); (2) to build a system of proportional intensity models with random effects for analyzing interval-censored multi-state processes that characterize disease progression over time; (3) to provide graphical and numerical techniques for checking the adequacy of semiparametric regression models with interval-censored data; and (4) to relax the proportional hazards assumption by allowing time-varying regression coefficients. All of these aims are motivated by the unmet methodological needs in the cohort studies that the investigators are currently conducting and address the most timely and important issues in human population health research. The estimation of model parameters is based on nonparametric likelihood (with an arbitrary event-time distribution) and other sound statistical principles.
The large-sample properties of the estimators will be established rigorously through innovative use of modern empirical process theory, semiparametric efficiency theory, and other advanced mathematical arguments. Computationally efficient and stable algorithms will be created to implement the inference procedures. The operating characteristics of the numerical algorithms and inference procedures will be evaluated extensively through simulation studies that mimic real data. The proposed methods will be applied to the Atherosclerosis Risk in Communities Study and the SubPopulations and InteRmediate Outcome Measures In COPD Study, both of which are being carried out at the University of North Carolina at Chapel Hill. These studies exemplify the broad challenges and opportunities arising from modern epidemiological research. The results will be published in both statistical and medical journals. Efficient, reliable, user-friendly, open-access, and well-documented R packages will be produced and disseminated to the broad scientific community. This research will c...

Key facts

NIH application ID: 10853505
Project number: 1R01HL173128-01
Recipient: UNIVERSITY OF MICHIGAN AT ANN ARBOR
Principal Investigator: Donglin Zeng
Activity code: R01
Funding institute: NIH
Fiscal year: 2024
Award amount: $396,020
Award type: 1
Project period: 2024-03-10 → 2028-02-29