# Use of a Machine Learning Approach to Impute Gene Expression in African Americans

> **NIH R21** · NORTHWESTERN UNIVERSITY · 2022 · $200,000

## Abstract

PROJECT SUMMARY
Multi-omics data have been invaluable in understanding the potential mechanisms behind SNP associations.
Using paired genomic and transcriptomic data allows investigators to determine the tissue-specific effects of
non-coding variation. However, most data of this type come from European ancestry populations.
Linear models that can impute gene expression from genotype data have been developed, mostly
from the GTEx resource. This resource contains paired genotype and gene expression data on 44
human tissues. Unfortunately, these models are built mostly on European data; they do not perform as well on
African American (AA) cohorts. To alleviate this disparity in both knowledge and data, we propose to use
both our own African American paired data and public African American data to create linear and machine
learning models to impute gene expression. We will then assess the utility of these models in predicting the
risk of venous thromboembolism in our ACCOuNT cohort. By building on our current knowledge of
transcriptome imputation, we will advance these methods for understudied admixed populations.

## Key facts

- **NIH application ID:** 10426288
- **Project number:** 5R21HG011695-02
- **Recipient organization:** NORTHWESTERN UNIVERSITY
- **Principal Investigator:** Minoli A Perera
- **Activity code:** R21
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $200,000
- **Award type:** 5
- **Project period:** 2021-06-10 → 2024-05-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10426288

## Citation

> US National Institutes of Health, RePORTER application 10426288, Use of a Machine Learning Approach to Impute Gene Expression in African Americans (5R21HG011695-02). Retrieved via AI Analytics 2026-05-22 from https://api.ai-analytics.org/grant/nih/10426288. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
