Methods for Mapping Genetic Regulatory Elements in Single Cells and Single Molecules

NIH RePORTER · NIH · R01 · $429,241

Abstract

PROJECT SUMMARY The human genome is regulated through interactions between DNA and proteins in the nucleus that define and maintain the epigenetic state of cells. Therefore, large consortia such as the Encyclopedia of DNA Elements (ENCODE) are dedicated to comprehensively mapping regulatory elements such as transcription factor binding or histone modification so that we may understand regulatory processes that guide development, disease, and the everyday functioning of cells in our body. However, current methods for genome-wide measurement of protein-DNA interactions are unable to map regulatory elements in highly repetitive regions of the genome because they rely on high-throughput, short-read DNA sequencing platforms. This limitation prohibits comprehensive investigation of roughly 8% of the human genome including centromeres and ribosomal DNA arrays, which play essential roles in chromosome segregation and nuclear organization. Furthermore, these methods typically lack the sensitivity to profile the epigenetic landscape of single cells, preventing high-resolution measurements of regulatory variation in complex tissues. The goal of this research program is to expand the toolbox for mapping protein-DNA interactions genome-wide and extend capabilities to long-read sequencing and single-cell sequencing technologies with the development of two methods: (1) Directed methylation and long-read sequencing (DiMeLo-Seq) and (2) single-cell directed methylation and sequencing (scDiMe-Seq). To record the genomic position of protein binding or histone modification, a methyltransferase fused to protein A will be directed to the targeted regulatory element with a primary antibody. Upon activation, the methyltransferase will methylate adenines in proximal DNA sequences.
DiMeLo-Seq will implement long-read DNA sequencing technologies such as nanopore sequencing to directly detect the position of these modifications on long molecules of DNA, taking advantage of the differential signal generated by methyl-adenines as they pass through the nanopore. This approach will produce sequencing reads up to hundreds of kilobases long, providing high-confidence mapping of regulatory elements to regions of the genome that are unmappable with short-read sequencing. To detect these modifications with single-cell sensitivity, scDiMe-Seq will enrich genomic loci containing methyl-adenines through targeted digestion, adapter ligation, and PCR amplification. These enriched fragments will then be sequenced using standard high-throughput sequencing. This project aims to develop DiMeLo-Seq and scDiMe-Seq through rigorous protocol optimization of the directed methylation strategy and sequencing library preparation for long- and short-read sequencing. The methods will then be characterized and validated by targeting well-studied features such as lamina-associated domains and CTCF landscapes, as well as H3K9me3 and CENPA, which are both enriched in centromeres. The overall goal of t...

Key facts

NIH application ID: 10657351
Project number: 5R01HG012383-02
Recipient: UNIVERSITY OF CALIFORNIA BERKELEY
Principal Investigator: Aaron Streets
Activity code: R01
Funding institute: NIH
Fiscal year: 2023
Award amount: $429,241
Award type: 5
Project period: 2022-07-01 → 2026-04-30