Identifying Biomarkers from Multi-source, Multi-way Data

NIH RePORTER · NIH · R01 · $299,108

Abstract

Project Summary

In medical research, a growing number of high-content platforms and technologies are used to measure diverse but related information. Examples include sequencing of the genome, epigenome, transcriptome and translatome, metabolite profiling, and imaging modalities. Moreover, data from the same high-content platform are often measured over multiple dimensions, such as multiple tissues, body regions, or developmental time points. We refer to data measured over multiple platforms or technologies as multi-source, and data measured over multiple dimensions as multi-way. Many modern biomedical studies collect data that are both multi-source and multi-way, meaning multi-way data are collected from multiple platforms. Multi-source multi-way data have enormous potential to capture and synthesize every facet of a complex biological system. However, to date there has been little methodology developed for fully integrative analysis of such data. We will focus on developing methods to identify biomarkers for a clinical outcome from multi-source multi-way data. Biomarkers are often used as a surrogate for disease progression or as an endpoint for clinical trials, and so their precision in capturing a given medical phenomenon is crucial. We propose to develop new composite biomarker methods that identify patterns across multiple sources of data, and multiple dimensions, that are associated with a clinical outcome. Our central hypothesis is that a fully integrated and multivariate approach will yield more precise biomarkers and simplify their interpretation. The novel product of this project will be a suite of methods extending common biomarker tasks to the multi-source multi-way context, including dimension reduction (Aim 1a), missing value imputation (Aim 1b), high-dimensional prediction (Aim 2) and dependent hypothesis testing (Aim 3).
This work is motivated by our involvement in several ongoing collaborative translational projects with rich multi-source multi-way data, including biomarker discovery for the development of lung cancer in chronic obstructive pulmonary disease patients, for the progression of neurodegenerative disorders such as Friedreich's Ataxia, and for brain iron deficiency in infants. We will apply and rigorously assess our multi-source multi-way approaches on these applications. All methods will be implemented in free, open-source and easily accessible software to facilitate their use by other researchers and practitioners.

Key facts

NIH application ID
10063530
Project number
5R01GM130622-03
Recipient
UNIVERSITY OF MINNESOTA
Principal Investigator
Eric F Lock
Activity code
R01
Funding institute
NIH
Fiscal year
2021
Award amount
$299,108
Award type
5
Project period
2019-03-01 → 2023-11-30