Using natural language processing and machine learning to identify potentially preventable hospital admissions among outpatients with chronic lung diseases

NIH RePORTER · NIH · K23 · $173,801

Abstract

Project Summary: Patients living with chronic lung diseases (CLDs) are frequently admitted to the hospital for potentially preventable causes. Such admissions may be discordant with patient preferences and/or represent a low-value allocation of health system resources. To anticipate such admissions, existing clinical prediction models in this field typically produce an “all-cause” risk estimate which, even if accurate, overlooks the actionable mechanisms behind admission risk and therefore fails to identify a prescribed response. This limitation may explain the only modest – at best – reductions in hospital admissions and readmissions seen in most intervention bundles that have been tested in this population. An opportunity exists, therefore, to predict hospitalization risk while simultaneously identifying patient phenotypes (i.e., some constellation of social, demographic, clinical, and other characteristics) for which known preventive interventions exist. The proposed study seeks to overcome these limitations and capitalize on this opportunity by (1) conducting semi-structured interviews with hospitalized patients with CLDs, and their caregivers and clinicians, to directly identify modifiable risks and their associated phenotypes driving hospital admissions; (2) using natural language processing (NLP) techniques to build classification models that will leverage nuanced narrative, social, and clinical information in the unstructured text of clinical encounter notes to identify patients with these phenotypes; and (3) building risk prediction models focused on actionable phenotypes with a wide array of traditional regression and machine learning approaches, while also incorporating large numbers of predictor variables from text data and accounting for time-varying trends.
The candidate's preliminary work using basic NLP techniques to significantly improve the discrimination of clinical prediction models in an inpatient population has motivated this methodologic approach. The rising burden and costs of hospitalizations associated with CLDs, and the increasing attention from federal payers, highlight the critical nature of this work. Completion of this research will build upon the candidate's past training, which includes a Master of Science in Health Policy Research obtained with NHLBI T32 support, and will provide the experience, education, and mentorship to allow the candidate to become a fully independent investigator. Based on the candidate's tailored training plan, he will acquire advanced skills in mixed-methods research, NLP, and trial design through coursework, close mentoring and supervision, and direct practice. These skills will position him ideally to submit successful R01s testing the deployment of the proposed clinical prediction models in real-world settings. The candidate's primary mentor, collaborators, and advisors will ensure adherence to the proposed timeline and goals and provide a supportive environment for him to develop ...

Key facts

NIH application ID: 10383738
Project number: 5K23HL141639-05
Recipient: UNIVERSITY OF PENNSYLVANIA
Principal Investigator: Gary Weissman
Activity code: K23
Funding institute: NIH
Fiscal year: 2022
Award amount: $173,801
Award type: 5
Project period: 2018-04-09 → 2023-03-31