# Advancing Analysis of Multi-omics Data in Alzheimer's Disease Research

> **NIH RF1** · UNIVERSITY OF PENNSYLVANIA · 2020 · $161,981

## Abstract

Alzheimer's disease (AD) is a major public health crisis and a national priority area of high significance. There
is a growing recognition that neurodegeneration and AD are multifactorial and may be attributed to harmful
changes at multiple levels, and that AD research must confront the challenge of elucidating the disease
mechanisms by leveraging big health data such as -omics data, imaging data, and electronic health record
(EHR) data. To harness the full power of such rich yet complex health data, powerful statistical and machine
learning methods have been developed for risk prediction, clinical decision support, and many other important
tasks. However, when statistical and machine learning algorithms are applied to data known to
contain sensitive information about individuals, it is widely recognized that, by exploiting
the output of the algorithms, an adversary may be able to identify some individuals in a particular dataset, thus
presenting serious privacy concerns. In addition, there is a growing recognition that powerful statistical and
machine learning methods can unintentionally lead to unfair outcomes for some (marginalized) populations,
defined by, say, sex, race/ethnicity, or age. While there is a growing body of literature on improving the fairness of
these algorithms for people across racial, gender, and other identities, there has been little work on assessing
the impact of missing data on fairness. Moreover, the areas of privacy and fairness have been under-investigated in
AD research. Building on recent work on privacy and fairness, this project seeks to develop and assess
methods for privacy and fairness in analysis of big health data for AD research. Our specific aims are as
follows. In Aim 1, we will refine the state-of-the-art Gaussian Differential Privacy (GDP) method for analysis of
big health data such as -omics data, imaging data, and EHR data using statistical and machine learning
algorithms, and compare its performance with that of existing methods based on (ε, δ)-DP. In Aim 2, we will
assess the impact of missing data on biases in big health datasets and on algorithmic fairness for analysis of
big health data for AD, particularly with respect to protected features such as sex and race/ethnicity. In Aim 3,
we will assess and compare the impact of existing imputation methods on algorithmic fairness for analysis of
big health data for AD, particularly with respect to protected features such as sex and race/ethnicity. This
project is expected to fill significant gaps in privacy and fairness for analysis of big health data for AD research
that have not been investigated before. The results generated from this study will advance methodology for
privacy protection and fairness protection in statistical and machine learning for AD research.
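
The comparison in Aim 1 between GDP and (ε, δ)-DP can be illustrated with a minimal sketch. It is not the project's method; it only uses standard published facts about μ-GDP: Gaussian noise with standard deviation `sensitivity / mu` gives μ-GDP, a μ-GDP mechanism is (ε, δ(ε))-DP with δ(ε) = Φ(−ε/μ + μ/2) − e^ε Φ(−ε/μ − μ/2), and μ-GDP guarantees compose as the square root of the sum of squared μ values. All function names below are hypothetical, and the classical calibration shown for contrast is the textbook formula, which assumes 0 < ε < 1.

```python
import math
import random
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal CDF


def gdp_noise_scale(sensitivity: float, mu: float) -> float:
    """Std dev of Gaussian noise giving mu-GDP for a query with this L2 sensitivity."""
    return sensitivity / mu


def gdp_delta(mu: float, eps: float) -> float:
    """delta(eps) for which a mu-GDP mechanism is (eps, delta)-DP."""
    return _PHI(-eps / mu + mu / 2) - math.exp(eps) * _PHI(-eps / mu - mu / 2)


def compose_gdp(mus) -> float:
    """Overall mu after composing several mu_i-GDP mechanisms."""
    return math.sqrt(sum(m * m for m in mus))


def classical_noise_scale(sensitivity: float, eps: float, delta: float) -> float:
    """Textbook Gaussian-mechanism calibration for (eps, delta)-DP, 0 < eps < 1."""
    return sensitivity * math.sqrt(2 * math.log(1.25 / delta)) / eps


def mu_for_target(eps: float, delta: float, lo: float = 1e-3, hi: float = 10.0) -> float:
    """Bisection for the mu whose delta(eps) equals the target (delta grows with mu)."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if gdp_delta(mid, eps) < delta:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def private_mean(values, lower, upper, mu, rng=random):
    """mu-GDP mean of values clipped to [lower, upper] (replace-one neighbors)."""
    clipped = [min(max(v, lower), upper) for v in values]
    sensitivity = (upper - lower) / len(clipped)
    sigma = gdp_noise_scale(sensitivity, mu)
    return sum(clipped) / len(clipped) + rng.gauss(0.0, sigma)


if __name__ == "__main__":
    eps, delta = 0.5, 1e-5
    mu = mu_for_target(eps, delta)
    print("noise std via GDP conversion:", gdp_noise_scale(1.0, mu))
    print("noise std via classical bound:", classical_noise_scale(1.0, eps, delta))
    ages = [62, 71, 68, 80, 75]  # made-up illustrative values, not study data
    print("private mean:", private_mean(ages, 50, 100, mu))
```

For the same (ε, δ) target, the noise scale obtained through the GDP conversion is smaller than the classical bound, which is one concrete sense in which the two accounting approaches can be compared.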

## Key facts

- **NIH application ID:** 10130699
- **Project number:** 3RF1AG063481-01A1S1
- **Recipient organization:** UNIVERSITY OF PENNSYLVANIA
- **Principal Investigator:** Qi Long
- **Activity code:** RF1
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $161,981
- **Award type:** 3
- **Project period:** 2019-08-15 → 2024-03-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10130699

## Citation

> US National Institutes of Health, RePORTER application 10130699, Advancing Analysis of Multi-omics Data in Alzheimer's Disease Research (3RF1AG063481-01A1S1). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10130699. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*