# Privacy-preserving genomic medicine at scale

> **NIH R01** · MASSACHUSETTS INSTITUTE OF TECHNOLOGY · 2020 · $636,185

## Abstract

High-throughput sequencing, biomedical imaging, and electronic health record technologies are generating health-related datasets of unprecedented scale. Integrative analysis of these resources promises to reveal new biology and drive personal and precision medicine. Yet the sensitive nature of these data often requires that they be kept in isolated silos, limiting their usefulness to science. The goal of this project is to develop innovative privacy-preserving algorithms to enable data sharing and drive genomic medicine. Crucially, we will draw upon our past success in secure genome analysis and algorithmic expertise in computational biology to address the imminent need to perform complex integrative analyses securely and at scale.

Current privacy-preserving tools are prohibitively costly for the complex calculations required in genomic analysis. We previously leveraged the highly structured nature of biological data and novel optimization strategies to implement efficient pipelines for secure genome-wide association studies (GWAS) and drug interaction predictions that scaled to millions of samples. In this project, we will further exploit the unique properties of biomedical data to: (i) develop secure integrative analysis methods for genomic medicine; (ii) develop an easy-to-use programming environment with advanced automated optimizations to facilitate the adoption of privacy-preserving analyses; and (iii) promote the use of our privacy techniques to gain novel biological insights through large-scale collaborative genetic studies of multi-ethnic cohorts.

With co-Is Amarasinghe (MIT) and Cho (Broad Institute), we aim to apply these tools to realize the first multi-institution, multi-national secure genetic studies with our partners at the Swiss Personalized Health Network, UK Biobank, Finnish FinnGen, All of Us, NIH NCBI, Broad, and Barcelona Supercomputing Center (Letters of Support). We will also use our privacy-preserving approaches to study genomic origins of polygenic traits for disease as well as neuroimaging and other clinical phenotypes. We will continue to actively integrate our methods into community standards (MPEG-G, GA4GH).

Successful completion of these aims will result in computational methods and open-source, easy-to-use, production-grade implementations that open the door to secure integration and analysis of massive sets of sensitive genomic and clinical data. With input from our collaborations, we will build these tools and apply them to better understand the molecular causes of human health and its translation to the clinic.

## Key facts

- **NIH application ID:** 9998648
- **Project number:** 1R01HG010959-01A1
- **Recipient organization:** MASSACHUSETTS INSTITUTE OF TECHNOLOGY
- **Principal Investigator:** BONNIE BERGER
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $636,185
- **Award type:** 1
- **Project period:** 2020-09-18 → 2024-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/9998648

## Citation

> US National Institutes of Health, RePORTER application 9998648, Privacy-preserving genomic medicine at scale (1R01HG010959-01A1). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/9998648. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
