# A Complex Disease Genetics Knowledge Provider for Biomedical Data Translator

> **NIH OT2** · BROAD INSTITUTE, INC. · 2020 · $530,030

## Abstract

A major goal of the Biomedical Data Translator Program is to facilitate disease classification
based on molecular and cellular abnormalities. While many experimental approaches exist to
interrogate molecular or cellular processes, few can discern which among a host of potential
abnormalities are relevant to disease in the human system. Genetic variants associated with
disease are unique in providing molecular alterations causally related to human disease risk.
There are two types of genetic associations. Rare disease associations can (usually) be
clearly linked to a gene and are well represented by catalogs such as ClinVar, OMIM, and
Monarch. Complex disease associations are harder to interpret because they (a) are statistical
rather than qualitative and (b) usually lie in noncoding genomic regions that cannot be
immediately translated to molecular or cellular abnormalities. Many complementary resources to
help in the biological translation of complex disease associations have recently emerged,
broadly classifiable as either “functional genomic” datasets (e.g. from epigenomic profiling or
chromatin capture) or predictive bioinformatic methods (e.g. that integrate various genetic and
functional genomic datasets to predict disease-susceptibility genes or pathways). These
resources require expertise to curate and interpret, and there is as yet no knowledge source
that integrates them to interpret complex disease associations. Furthermore, techniques for
harmonizing heterogeneous functional genomic datasets with respect to one another are not yet
established, most predictive bioinformatic methods specify complex data-processing pipelines
that have not yet been scaled to run across many diseases, and there are few if any “gold
standards” to evaluate the molecular or cellular abnormalities identified by these resources.
The goal of our proposed project is to address these gaps within a complex
disease genetics Knowledge Provider for Translator. We are experts in complex disease
genetics and maintain the Knowledge Portal Network (KPN), a collection of open-source web
portals and Smart APIs that make integrated genetic and genomic datasets publicly accessible
for >180 complex diseases. We have built the KPN by developing a protocol for working with
disease experts to aggregate and curate high-confidence genetic datasets, building
computational pipelines to harmonize these data and apply predictive bioinformatic methods
upon them, and extracting relationships mined from these data into a Neo4j graph database.
We propose to use the KPN as a foundation to implement a Translator Knowledge Provider of
high-confidence complex disease associations and predicted disease-relevant molecular and
cellular abnormalities. We will implement this Knowledge Provider by (a) expanding the data
sources, data types, and bioinformatic methods integrated within the KPN; (b) developing new
computational algorithms to improve the ability of genetic data to identify molecular and...

## Key facts

- **NIH application ID:** 10056863
- **Project number:** 1OT2TR003433-01
- **Recipient organization:** BROAD INSTITUTE, INC.
- **Principal Investigator:** Jason Flannick
- **Activity code:** OT2
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $530,030
- **Award type:** 1
- **Project period:** 2020-01-23 → 2024-11-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10056863

## Citation

> US National Institutes of Health, RePORTER application 10056863, A Complex Disease Genetics Knowledge Provider for Biomedical Data Translator (1OT2TR003433-01). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10056863. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
