# Secure and Privacy-preserving Genome-wide and Phenome-wide Association Studies via Intel Software Guard Extensions (SGX)

> **NIH R01** · TRUSTEES OF INDIANA UNIVERSITY · 2022 · $340,558

## Abstract

With the rapid growth of the data volume (e.g., human genomic data) collected in biomedical research,
data protection, in particular for patients’ privacy in secondary uses of these data, has attracted much
attention recently. Today, a vast majority of sensitive biomedical data, including individual human
genomic data and their associated health metadata, are shared only through controlled-access
databases (e.g., dbGaP), and biomedical researchers are required to sign a user agreement before
getting access to these data. Security research has already produced a suite of techniques that can
serve the general purpose of privacy-preserving computation; their direct applications are, however,
too expensive (in terms of resource consumption) for real-world biomedical applications.
An alternative is the hardware-assisted Trusted Execution Environment (TEE), with solutions developed
or under development by hardware vendors (Intel, AMD, ARM) and the open-source research
community. A prominent example is Intel’s Software Guard Extensions (SGX), which is available as a
feature in Intel's mainstream CPUs (e.g., Skylake and Kaby Lake). In this project, we plan to explore
potential applications of TEE to two popular genome computation tasks involving sensitive biomedical
data, i.e., genome-wide and phenome-wide association studies (GWAS and PheWAS). For GWAS, a secondary
research user may collect genomic sequences (in encrypted form) from multiple data owners, covering
individuals with (cases) or without (controls) a disease phenotype, on which association tests or
advanced GWAS algorithms can be conducted within the SGX enclave. Similarly, for PheWAS, a user may
collect phenotype data from individuals whose genomes contain (cases) or do not contain (controls)
one or more specific variations.
We will address two issues when developing these approaches: 1) we will customize GWAS/PheWAS
algorithms for efficient execution in the TEE with limited resources (e.g., memory and I/O), and 2) we
will develop new genome computing outsourcing and data sharing platforms using the SGX techniques,
and further understand and mitigate their potential side-channel risks with regard to GWAS/PheWAS
computing tasks. The proposed research will lead to a practical solution for secure GWAS and PheWAS
in three application scenarios: 1) secure outsourcing: a research institution collects matched genomic
and phenotypic data from a large cohort of case and control individuals, and outsources the storage of
these data and potentially repeated GWAS and PheWAS computation to a public or commercial cloud;
2) secure collaboration: a consortium of researchers across multiple institutions attempts to collaborate
on a large GWAS/PheWAS study using the data collected by each participating institution; and 3)
secure data sharing: researchers want to share their data with a broad biomedical research community
so that potential data users may conduct a secondary GWAS/PheWAS analysis.
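
As a point of reference for the "association tests" mentioned above, the following is a minimal sketch of one common choice, the allelic Pearson chi-square test on a 2x2 table of case/control allele counts for a single variant. The abstract does not say which tests the project implements, so the choice of test, the function name `allelic_chi_square`, and its parameters are illustrative assumptions; the sketch also runs as ordinary Python and does not model the SGX enclave, encryption, or side-channel mitigation.

```python
import math


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on allele counts.

    Hypothetical helper for illustration only; not taken from the project.
    Returns (chi2, p_value) for one variant.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row_totals = [sum(row) for row in table]
    col_totals = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    n = sum(row_totals)

    # A degenerate table (an empty row or column) carries no signal.
    if n == 0 or 0 in row_totals or 0 in col_totals:
        return 0.0, 1.0

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Survival function of the chi-square distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    chi2, p = allelic_chi_square(case_alt=30, case_ref=70, ctrl_alt=10, ctrl_ref=90)
    print(f"chi2={chi2:.3f}, p={p:.3g}")
```

A full GWAS would repeat a test of this kind across many variants, which is where the memory and I/O limits of the TEE noted in the abstract become relevant.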

## Key facts

- **NIH application ID:** 10470341
- **Project number:** 5R01HG010798-04
- **Recipient organization:** TRUSTEES OF INDIANA UNIVERSITY
- **Principal Investigator:** HAIXU TANG
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $340,558
- **Award type:** 5
- **Project period:** 2019-08-09 → 2024-05-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10470341

## Citation

> US National Institutes of Health, RePORTER application 10470341, Secure and Privacy-preserving Genome-wide and Phenome-wide Association Studies via Intel Software Guard Extensions (SGX) (5R01HG010798-04). Retrieved via AI Analytics 2026-05-21 from https://api.ai-analytics.org/grant/nih/10470341. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
