Learning Precision Medicine for Rare Diseases Empowered by Knowledge-driven Data Mining

NIH RePORTER · NIH · R01 · $696,216

Abstract

Despite their individual rarity, rare diseases collectively affect one in eleven Americans. Rare disease patients often face significant diagnostic delays, waiting an average of 6 years from the onset of symptoms to an accurate diagnosis. Recent advances in precision medicine have accelerated research in rare diseases, overwhelming clinicians’ capacity to manage and apply the latest knowledge efficiently in clinical practice. For example, novel gene mutations related to idiopathic pulmonary fibrosis (IPF) frequently do not appear in the Human Gene Mutation Database (HGMD) or other knowledge bases and are present only in the initial articles. Additionally, because clinical evidence and empirical knowledge are lacking, awareness of rare diseases remains low among healthcare providers, which is a major reason for the diagnostic odysseys many patients experience in practice.

Teaming up the Mayo Clinic Program for Rare and Undiagnosed Diseases (PRaUD) in partnership with Vanderbilt University Medical Center (VUMC), we aim to address this translation gap by building a novel end-to-end informatics framework to accelerate the diagnosis of rare diseases. We plan to develop the proposed framework through three specific aims.

Aim 1 is to construct RDAccelerate, a computable rare disease knowledge hub that accumulates and maintains up-to-date knowledge for rare diseases. Staying current with the literature and informed by clinical evidence and empirical experience is costly. To address this, we will leverage the latest natural language processing (NLP) techniques, such as pre-trained language models (PLMs), and data mining techniques, such as graph neural network (GNN) embeddings, to accelerate the extraction, integration, and mining of associations from a diverse range of resources.

Aim 2 focuses on providing RDRecommend, a deep phenotype-driven system for rare disease differential diagnoses trained with the up-to-date knowledge in RDAccelerate and longitudinal patient records of rare disease cohorts. Reaching an accurate diagnosis often takes substantial time and effort because these diseases are rare. We therefore propose to apply various recommendation techniques to suggest rare disease differential diagnoses. We will then develop RDConnect, a web portal to search information, display differential diagnostic recommendations, and collect clinical evidence automatically for further validation in Aim 3.

The proposed informatics framework will be evaluated through several practice projects at PRaUD in collaboration with clinical co-Investigators. The framework will be developed through team science collaboration using two rare diseases (IPF and mastocytosis). We will then validate the framework in supporting two other rare diseases (hypereosinophilic syndrome [HES] and rare kidney stone) before scaling up to a broad spectrum of rare diseases. The external generalizability of the solution will be tested through our subsite pa...

Key facts

NIH application ID: 10922807
Project number: 5R01HG012748-02
Recipient: UNIVERSITY OF TEXAS HLTH SCI CTR HOUSTON
Principal Investigator: HONGFANG LIU
Activity code: R01
Funding institute: NIH
Fiscal year: 2024
Award amount: $696,216
Award type: 5
Project period: 2023-09-06 → 2027-06-30