Interpretable Deep Learning Methods to Investigate Genetics and Epigenetics of Alzheimer's Disease at a Single-Cell Resolution

NIH RePORTER · NIH · R01 · $634,315

Abstract

Alzheimer's disease and related dementias (ADRDs) are complex multifactorial disorders characterized by progressive memory loss, confusion, and impaired cognitive abilities in older adults. In addition to genetic variants, studies have reported that certain epigenetic, network, and genome organizational perturbations, and their complex interplay, contribute to ADRD progression, informing new cellular etiologies. The recent single-cell revolution, especially multimodal genomic profiling, makes it possible to scrutinize multi-scale dysregulations in ADRDs at the finest possible resolution. However, few methods have been developed to address this critical yet challenging task due to the high missingness, dimensionality, and complex feature interactions in single-cell data. In this project, we will develop interpretable deep learning methods and software tools to highlight multi-scale dysregulations contributing to ADRDs, including genetic, epigenetic, network, and chromatin structural alterations at a single-cell resolution. Distinct from previous efforts reporting a set of one-dimensional (1D) functional cis-regulatory elements (CREs) from only one genome and applying it to all samples, we aim to construct a personal, compact, gene-centric, and cell-type-specific brain regulome from sc-multiome data. Specifically, we will first propose a scalable multimodal deep generative model to integrate large-scale, heterogeneous ADRD single-cell data with single-, multi-, and hybrid modalities. Distinct from existing methods, we will include an invariant representation learning scheme to derive latent cell representations uncorrelated with confounding factors (e.g., age, gender, read depth, and batch effects) for bias-free transcriptome and epigenome reconstruction (Aim 1).
Then, we will go beyond the 1D genome annotation by deciphering the multi-scale gene regulation code (Aim 2), including cell-type-specific chromatin compartmentation, CREs and their target genes for functional interpretation, and transcription factor (TF) regulatory networks (TRNs). Lastly, we will develop interpretable deep learning models to link multi-scale dysregulations to ADRD with mechanistic explanations (Aim 3). This proposal is built on an existing multi-year collaboration among the Zhang, Won, and Gerstein labs that originated from the ENCODE and PsychENCODE projects, with diverse expertise in computer science, neuroscience, and genomics. Upon completion, our proposal will significantly accelerate research in the broader scientific community by providing essential tools to investigate functional regions in the genome and prioritize multi-scale risk factors for ADRD.

Key facts

NIH application ID: 10698166
Project number: 5R01NS128523-02
Recipient: UNIVERSITY OF CALIFORNIA-IRVINE
Principal Investigator: JING ZHANG
Activity code: R01
Funding institute: NIH
Fiscal year: 2023
Award amount: $634,315
Award type: 5
Project period: 2022-08-30 → 2027-07-31