Refining Predictive Models for Neglected and Emerging Infectious Diseases

NIH RePORTER · NIH · R35 · $377,500

Abstract

PROJECT SUMMARY

Predictive models play an essential role in disease prevention and control. Recent advances in scientific research have allowed more thorough and in-depth data collection from epidemiological studies (e.g., GPS data, climate data, wearable device data). However, because many variables are collected and the time frame for epidemiological data collection during some epidemics is relatively short, missing information is unavoidable, and subsequent updates of the database may be necessary. How to incorporate data with partial information (i.e., with missingness) and predictors measured dynamically over time into existing models to produce more accurate and efficient predictions remains a challenge. Recently, the PI and his team have developed predictive models for various purposes across several neglected and emerging infectious diseases, including schistosomiasis, COVID-19, and human seasonal influenza. While conducting these studies, we identified several practical issues that hinder broader implementation of the proposed models, such as missing data and a lack of adaptive mechanisms for dynamic inflows of predictors. Existing models that adopt the complete data analysis approach significantly reduce statistical power and may introduce bias. Moreover, predictive models applied in infectious disease epidemiology often rely on historical data collected up to a time point without taking future data inputs into consideration. Meanwhile, developments in statistical and machine learning methods have laid the foundation for new dynamic predictive models based on trajectory data, with recent progress in functional concurrent regression and incremental learning. However, these methodological advances have been poorly integrated into field applications. Even in recent COVID-19 research, where advanced dynamic models have been developed, the balance between data flow and prediction window has not been well studied.
In addition, existing models often require collecting a large number of variables, so a practical two-stage approach that allows limited data collection early on can be more time- and cost-effective. In this MIRA proposal, we aim to refine predictive models for several neglected and emerging infectious diseases. Specifically, three coherent projects with distinct research activities will be pursued: 1) refining hotspot prediction models for schistosomiasis interventions; 2) developing and validating prognostic risk models for COVID-19 in the US, with methods development on missing data handling and functional regression for dynamic prediction; and 3) developing and validating a vaccine benefits score for human seasonal influenza. The refined models are expected to be accompanied by new and more general predictive algorithms involving missing data processing and dynamic prediction mechanisms to enhance model performance and adaptability. The methodological development from this proposal will a...

Key facts

NIH application ID: 10707496
Project number: 5R35GM146612-02
Recipient: UNIVERSITY OF GEORGIA
Principal Investigator: Ye Shen
Activity code: R35
Funding institute: NIH
Fiscal year: 2023
Award amount: $377,500
Award type: 5
Project period: 2022-09-21 → 2027-07-31