A Complex Disease Genetics Knowledge Provider for Biomedical Data Translator

NIH RePORTER · NIH · OT2 · $484,013

Abstract

A major goal of the Biomedical Data Translator Program is to facilitate disease classification based on molecular and cellular abnormalities. While many experimental approaches exist to interrogate molecular or cellular processes, few can discern which among a host of potential abnormalities are relevant to disease in the human system. Genetic variants associated with disease are unique in providing molecular alterations causally related to human disease risk. There are two types of genetic associations. Rare disease associations can usually be clearly linked to a gene and are well represented by catalogs such as ClinVar, OMIM, and Monarch. Complex disease associations are harder to interpret because they (a) are statistical rather than qualitative and (b) usually lie in noncoding genomic regions that cannot be immediately translated to molecular or cellular abnormalities. Many complementary resources to help in the biological translation of complex disease associations have recently emerged. They can be broadly classified as either “functional genomic” datasets (e.g., from epigenomic profiling or chromatin capture) or predictive bioinformatic methods (e.g., those that integrate various genetic and functional genomic datasets to predict disease-susceptibility genes or pathways). These resources require expertise to curate and interpret, and there is as yet no knowledge source that integrates them to interpret complex disease associations. Furthermore, techniques for harmonizing heterogeneous functional genomic datasets with respect to one another are not yet established; most predictive bioinformatic methods specify complex data-processing pipelines that have not yet been scaled to run across many diseases; and there are few, if any, “gold standards” to evaluate the molecular or cellular abnormalities identified by these resources. The goal of our proposed project is to address these gaps within a complex disease genetics Knowledge Provider for Translator.
We are experts in complex disease genetics and maintain the Knowledge Portal Network (KPN), a collection of open-source web portals and Smart APIs that make integrated genetic and genomic datasets publicly accessible for >180 complex diseases. We have built the KPN by developing a protocol for working with disease experts to aggregate and curate high-confidence genetic datasets, building computational pipelines to harmonize these data and apply predictive bioinformatic methods to them, and extracting relationships mined from these data into a Neo4j graph database. We propose to use the KPN as a foundation to implement a Translator Knowledge Provider of high-confidence complex disease associations and predicted disease-relevant molecular and cellular abnormalities. We will implement this Knowledge Provider by (a) expanding the data sources, data types, and bioinformatic methods integrated within the KPN; (b) developing new computational algorithms to improve the ability of genetic data to identify molecular and...

Key facts

NIH application ID: 10548478
Project number: 3OT2TR003433-01S2
Recipient: BROAD INSTITUTE, INC.
Principal Investigator: Jason Flannick
Activity code: OT2
Funding institute: NIH
Fiscal year: 2022
Award amount: $484,013
Award type: 3
Project period: 2020-01-23 → 2022-11-30