# Methods for integrated analysis of multi-level omics data

> **NIH R01** · MASSACHUSETTS GENERAL HOSPITAL · 2022 · $417,569

## Abstract

Project Summary

Novel analytic paradigms allowing for fully integrated interrogation of independent genomics data resources are expected to reveal substantial new knowledge regarding the mechanistic foundations of genetic associations. In this proposal we aim to develop, evaluate and apply sound statistical methods for leveraging and integrating the vast amount of publicly available transcriptome and genomics resources to improve understanding of the mechanistic relationships among genes and regulatory elements associated with complex traits. Ultimately, methods for uncovering the molecular and physiological underpinnings of complex diseases will have clinically relevant impact on the development of novel prognostic markers and therapeutic targets.

The Specific Aims are to:

(1) Develop a likelihood-based framework for integrated analysis of genomic elements, expression profiles and phenotypes. An overarching challenge in this setting is that transcriptomics data, composed of genotypes and expression profiles, and GWA data, composed of genotypes and complex traits, are generally available only for independent cohorts. We propose combining these two data resources and framing the analysis in terms of a missing data problem. The unobserved expression profiles in the GWA data are treated as missing and an expectation-maximization (EM) approach is proposed. Methods for efficient implementation and inference, as well as an alternative Bayesian MCMC approach, are also described.

(2) Extend the methods of Aim 1 for alternative data structures and types. The framework of Aim 1 will be further developed to: (a) account for complex linkage disequilibrium (LD) structures within and across genes; (b) address disparities across genotyping platforms; (c) provide for simultaneous investigation of multiple cell and tissue compartments, multiple isoforms, and multiple genes and regulatory elements; and (d) accommodate time-varying biomarker profiles and time-to-event outcomes.

(3) Apply and evaluate performance of the methods developed in Aims 1 and 2. In addition to fully vetting the proposed methods and comparing to alternative strategies using extensive simulation studies, we will further unravel and elucidate the mechanisms of gene and regulatory element control of complex traits using multiple publicly available reference transcriptome data resources, repeatedly measured biomarker data arising from the GENE study, and clinical outcomes from the CRIC study (see Section C).

This application launches from an extensive, decade-long and highly productive trans-disciplinary collaboration. Building on a strong research and mentoring record, the proposed research offers novel statistical research addressing pressing challenges in precision medicine.
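
The missing-data framing of Aim 1 can be illustrated with a toy EM loop. This is a minimal sketch and not the investigators' method. It assumes one centered SNP genotype `g`, one expression value `e`, one quantitative trait `y`, and the linear Gaussian models `e = alpha*g + noise` and `y = beta*e + noise`, none of which are specified in the abstract. The function names, parameter names, and simulated data are all hypothetical.

```python
import random


def center(xs):
    """Subtract the sample mean so the no-intercept models below apply."""
    mean = sum(xs) / len(xs)
    return [x - mean for x in xs]


def em_two_cohort(ref, gwas, n_iter=300):
    """EM for a toy two-cohort model (illustrative assumption, see lead-in).

    ref  : list of (g, e) pairs, expression observed
    gwas : list of (g, y) pairs, expression e treated as missing
    Returns (alpha, beta, s2_e, s2_y).
    """
    sum_ge_ref = sum(g * e for g, e in ref)
    sum_gg_ref = sum(g * g for g, _ in ref)
    sum_gg_all = sum_gg_ref + sum(g * g for g, _ in gwas)
    n_all = len(ref) + len(gwas)

    # Start from the reference-cohort fit, with no expression-trait effect.
    alpha = sum_ge_ref / sum_gg_ref
    s2_e = sum((e - alpha * g) ** 2 for g, e in ref) / len(ref)
    beta = 0.0
    s2_y = sum(y * y for _, y in gwas) / len(gwas)

    for _ in range(n_iter):
        # E-step: posterior of the missing e given (g, y) is Gaussian with
        # common variance v and subject-specific mean m[i].
        v = 1.0 / (1.0 / s2_e + beta * beta / s2_y)
        m = [v * (alpha * g / s2_e + beta * y / s2_y) for g, y in gwas]

        # M-step: least-squares updates with e and e^2 replaced by their
        # posterior expectations m[i] and m[i]^2 + v in the GWA cohort.
        alpha = (sum_ge_ref
                 + sum(g * mi for (g, _), mi in zip(gwas, m))) / sum_gg_all
        s2_e = (sum((e - alpha * g) ** 2 for g, e in ref)
                + sum((mi - alpha * g) ** 2 + v
                      for (g, _), mi in zip(gwas, m))) / n_all
        beta = (sum(y * mi for (_, y), mi in zip(gwas, m))
                / sum(mi * mi + v for mi in m))
        s2_y = sum(y * y - 2 * beta * y * mi + beta * beta * (mi * mi + v)
                   for (_, y), mi in zip(gwas, m)) / len(gwas)
    return alpha, beta, s2_e, s2_y


def simulate(n, alpha, beta, rng):
    """Draw genotypes (allele frequency 0.3), expression and trait values."""
    g = center([(rng.random() < 0.3) + (rng.random() < 0.3) for _ in range(n)])
    e = [alpha * gi + rng.gauss(0.0, 1.0) for gi in g]
    y = [beta * ei + rng.gauss(0.0, 1.0) for ei in e]
    return g, center(e), center(y)


rng = random.Random(1)
g_ref, e_ref, _ = simulate(1000, 1.0, 0.5, rng)   # trait discarded
g_gwa, _, y_gwa = simulate(2000, 1.0, 0.5, rng)   # expression discarded
ref = list(zip(g_ref, e_ref))
gwas = list(zip(g_gwa, y_gwa))

alpha_hat, beta_hat, s2_e_hat, s2_y_hat = em_two_cohort(ref, gwas)
print("alpha:", alpha_hat, "beta:", beta_hat)
```

In this deliberately simple model the maximum-likelihood estimate of `beta` coincides with the ratio of the two per-cohort regression slopes, so the loop adds nothing here beyond showing the E-step and M-step mechanics.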

## Key facts

- **NIH application ID:** 10120700
- **Project number:** 5R01GM127862-05
- **Recipient organization:** MASSACHUSETTS GENERAL HOSPITAL
- **Principal Investigator:** Andrea S Foulkes
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $417,569
- **Award type:** 5
- **Project period:** 2018-04-01 → 2023-03-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10120700

## Citation

> US National Institutes of Health, RePORTER application 10120700, Methods for integrated analysis of multi-level omics data (5R01GM127862-05). Retrieved via AI Analytics 2026-05-27 from https://api.ai-analytics.org/grant/nih/10120700. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
