Direct Determination of Multiple Specific Forms of DNA Chemical Modifications in Human Genome

NIH RePORTER · NIH · R56 · $233,456

Abstract

The information content of DNA is not limited to the primary sequence (A, C, G, T), but is also conveyed by chemical modifications of individual bases. For example, DNA methylation, specifically 5-methylcytosine (5mC), has been widely studied for its important regulatory roles in human development and disease. In addition, the discovery that TET enzymes mediate active demethylation of 5mC through 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC) has yielded important insights into the dynamic nature of the human methylome and its close relevance to multiple human diseases. Beyond these chemical modifications to cytosine, recent studies by us and others discovered that N6-methyladenine (6mA), another form of methylation previously thought to exist exclusively in bacteria and protozoa, also exists in eukaryotic genomes, including the human genome. In addition to these epigenetic marks, different forms of DNA damage represent another category of DNA chemical modifications of important biological relevance. Although a few methods for mapping individual chemical modifications have been developed and some are widely used, it is usually hard for researchers in the broader community to master a separate protocol for each form of modification. While third-generation sequencing technologies support the direct detection of DNA modifications, they face fundamental challenges in distinguishing among different forms of modification. The objective of this project is to develop a novel technology for the direct, simultaneous mapping of multiple forms of DNA methylation and DNA damage events. The core idea is that each form of nucleic acid modification has a unique signature in its physical interaction with DNA polymerase or with nanopores in third-generation sequencing, and that these signatures can be modeled by deep learning methods.
We will develop this technology using multiple innovative strategies to address a few fundamental challenges, and then comprehensively evaluate the technology to facilitate broad applications.

Key facts

NIH application ID
10267380
Project number
1R56HG011095-01
Recipient
ICAHN SCHOOL OF MEDICINE AT MOUNT SINAI
Principal Investigator
Gang Fang
Activity code
R56
Funding institute
NIH
Fiscal year
2020
Award amount
$233,456
Award type
1
Project period
2020-09-25 → 2021-04-30