# Tensor Array Methods for RNA-Seq Analysis

> **NIH R01** · UNIVERSITY OF MICHIGAN AT ANN ARBOR · 2022 · $336,005

## Abstract

RNA-Sequencing (RNA-Seq) analysis provides a critical means to understand gene functions. High-throughput
RNA-Seq data are frequently measured under multiple conditions from the same set of samples. For example,
in the NIH Common Fund’s Genotype-Tissue Expression (GTEx) project, samples from different tissues are
collected from each post-mortem donor for sequencing. In another study, on ultraviolet (UV) radiation, skin
keratinocytes from the same set of subjects are exposed to different radiation doses and durations before
sequencing. Such common-sample, multi-condition RNA-Seq data have information shared across both
samples and conditions, and have the potential to provide key insights into gene functions. However, despite
great endeavors to collect such data, there is a lack of analytical methods and computational tools to maximize
their potential. Important tasks such as missing data imputation, functional gene module identification and
association analysis remain unaddressed. In this proposal, we will build an innovative and powerful paradigm
to analyze multi-condition RNA-Seq data and thus improve our understanding of gene functions. To leverage
information across conditions, samples and genes simultaneously, we propose to model RNA-Seq data as
multi-way tensor arrays. We will develop novel tensor methods and theory that are appropriate for read count
data. In particular, our first aim is to extend tensor completion methods for block-wise missing RNA-Seq data
imputation. By modeling unobserved samples as missing blocks in a tensor, we will aggregate information
along different modes (subjects, conditions, genes) to impute missing values. The second aim develops
flexible tensor co-clustering methods, which simultaneously cluster genes, samples and conditions, for
co-expressed gene module identification. The third aim is to build new tensor response regression models to
associate gene modules with genotype and covariates, which will provide insights into genetic regulation such
as expression quantitative trait loci (eQTL). Finally, in the fourth aim, we will develop scalable statistical
software to implement the proposed methods and make them more broadly applicable. We will apply the
methods to the GTEx multi-tissue data and UV multi-condition data, and gain novel insights into gene
expression and regulation. The proposed research will likely transform how we analyze multi-condition
RNA-Seq data and enhance our understanding of human genomics and its relation to public health.
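
As an illustration of the data layout described in the first aim, the following is a minimal sketch of how common-sample, multi-condition read counts might be arranged as a subjects × conditions × genes array with block-wise missing entries. The dimensions, the simulated Poisson counts, and the `naive_impute` helper are all hypothetical and chosen only for illustration. The condition-wise mean fill is a simple baseline, not the tensor completion method the project proposes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions: 4 subjects, 3 conditions (e.g. tissues), 5 genes.
n_subjects, n_conditions, n_genes = 4, 3, 5

# Simulated read counts stored as a subjects x conditions x genes array.
counts = rng.poisson(lam=20.0, size=(n_subjects, n_conditions, n_genes)).astype(float)

# observed[i, j] is False when subject i was never sequenced under condition j.
observed = np.ones((n_subjects, n_conditions), dtype=bool)
observed[0, 2] = False
observed[3, 1] = False

# An unsequenced sample removes a whole gene fiber, giving block-wise missingness.
tensor = counts.copy()
tensor[~observed] = np.nan


def naive_impute(x):
    """Fill missing fibers with the per-gene mean over observed subjects
    in the same condition (an assumed baseline, not a low-rank method)."""
    filled = x.copy()
    cond_mean = np.nanmean(x, axis=0)  # shape: (conditions, genes)
    missing = np.isnan(x)
    filled[missing] = np.broadcast_to(cond_mean, x.shape)[missing]
    return filled


imputed = naive_impute(tensor)
```

This baseline borrows information along the subject mode only. A tensor completion approach of the kind the abstract describes would instead combine information along the subject, condition, and gene modes at once.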

## Key facts

- **NIH application ID:** 10356949
- **Project number:** 5R01HG010731-04
- **Recipient organization:** UNIVERSITY OF MICHIGAN AT ANN ARBOR
- **Principal Investigator:** Gen Li
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $336,005
- **Award type:** 5
- **Project period:** 2020-08-31 → 2025-02-28

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10356949

## Citation

> US National Institutes of Health, RePORTER application 10356949, Tensor Array Methods for RNA-Seq Analysis (5R01HG010731-04). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10356949. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
