Data-driven search of Common Fund data sets for better discoverability and novel meta-analysis

NIH RePORTER · NIH · R03 · $312,973

Abstract

Project Summary

NIH Common Fund (CF) programs have produced a number of unique, high-value data sets. To solve complex biomedical questions, we need to find related data sets that can be co-analyzed for specific study purposes. Many current search techniques depend on data descriptors that differ across CF programs and may be incomplete or inaccurate. Many of these experiments output lists of genes significant to certain biomedical conditions. We propose to use these gene lists to find similar data sets. This approach will not only enable searching across CF data sets but can also connect them to experiments in other databases and biomedical catalogs, e.g., databases containing disease-gene associations and molecular pathways. To achieve this aim, we will implement an efficient linear algorithm to calculate similarities between large numbers of gene sets. Our prototype tool, DBRetina, uses this algorithm to build huge similarity networks in a few minutes using minimal computational resources. DBRetina serves as the foundation for CurIndex, a study similarity graph database that connects multiple health-related resources. DBRetina and CurIndex will allow advanced search for related CF experiments and facilitate better interpretation of biomedical data.
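
To make the idea of gene-list similarity concrete, the following is a minimal sketch and not DBRetina's actual implementation. The abstract does not specify the algorithm or the similarity measure; this sketch assumes Jaccard similarity and uses a generic inverted index (gene to containing sets) so that only pairs of sets sharing at least one gene are ever compared. The function name `pairwise_jaccard` and the example set names and contents are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def pairwise_jaccard(gene_sets):
    """Return {(a, b): jaccard} for every pair of sets sharing at least one gene.

    Assumption: Jaccard is used purely for illustration; the source does not
    name the similarity measure.
    """
    # Inverted index: gene -> names of the sets that contain it.
    index = defaultdict(list)
    for name, genes in gene_sets.items():
        for gene in set(genes):
            index[gene].append(name)

    # Count shared genes only for pairs that co-occur under some gene.
    shared = defaultdict(int)
    for names in index.values():
        for a, b in combinations(sorted(names), 2):
            shared[(a, b)] += 1

    sizes = {name: len(set(genes)) for name, genes in gene_sets.items()}
    # Jaccard = |A ∩ B| / |A ∪ B|, with |A ∪ B| = |A| + |B| - |A ∩ B|.
    return {
        (a, b): n / (sizes[a] + sizes[b] - n)
        for (a, b), n in shared.items()
    }


# Hypothetical gene sets, for illustration only.
example = {
    "study_A": {"TP53", "BRCA1", "EGFR"},
    "study_B": {"TP53", "BRCA1", "MYC"},
    "pathway_C": {"MYC", "KRAS"},
}
similarities = pairwise_jaccard(example)
```

Each resulting pair can be read as a weighted edge in a similarity network: here `study_A` and `study_B` are linked by two shared genes, `study_B` and `pathway_C` by one, and `study_A` and `pathway_C` produce no edge at all because they share nothing.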

Key facts

NIH application ID: 10577377
Project number: 1R03OD034502-01
Recipient: UNIVERSITY OF CALIFORNIA AT DAVIS
Principal Investigator: Tamer Ahmed Mansour Ahmed
Activity code: R03
Funding institute: NIH
Fiscal year: 2022
Award amount: $312,973
Award type: 1
Project period: 2022-09-20 → 2025-09-19