# Computational epigenetics modeling of cell identity genes

> **NIH R01** · BOSTON CHILDREN'S HOSPITAL · 2021 · $374,170

## Abstract

Cell identity genes are a group of functionally linked genes that jointly implement the phenotype of a given cell
type. A major constraint on the study of cell identity is the lack of a robust method to define the catalogue of identity
genes for a cell type, and to identify master transcription factors that regulate the expression network of cell
identity genes and drive cell identity specification.
Intrigued by our recent discoveries, we hypothesize that cell identity genes can be identified using epigenetic
features that reflect their distinct transcriptional regulation mechanisms. We and several other groups
discovered that cell identity genes display unique epigenetic features, e.g., broad H3K4me3 (Chen et al.,
Nature Genetics, 2015) and super-enhancers. We showed that these features are associated with strong and
stable transcription activation signals for cell identity genes in their associated cell type, but not in other cell
types. Biologists have recently used super-enhancers or broad H3K4me3 as markers to nominate cell identity genes.
However, it is still challenging for most biologists to use this method, as the required bioinformatics
tools are not yet available.
Our overall goal in this proposal is to extend the development of our computational epigenetic methods for cell
identity gene discovery. Leveraging the early success of our bioinformatics algorithms DANPOS (Chen et al.,
Genome Research, 2013) and DANPOS2 (Chen et al., Nature Genetics, 2015), we will develop a series of
new algorithms to (1) define epigenetic features for cell identity genes, (2) customize parameters for ChIP-Seq
analysis of epigenetic features, (3) collect known cell identity genes on the basis of a thorough literature search
followed by manual inspection, (4) systematically identify unknown cell identity genes, and (5) define master
transcription factors that regulate the network of cell identity genes and drive cell identity specification. As a
proof of principle, we will apply our novel methods to study cell identity determinants for endothelial cells (ECs) in
collaboration with Drs. John P. Cooke, Longhou Fang, and Qi Cao, three experts in EC biology, angiogenesis,
and epigenetics.
Successful completion of this study is expected to have a broad positive impact on the study of cell identity
determination, transcriptional regulation, and chromatin epigenetics. The scientific community will be able to
use the bioinformatics tools developed in this proposal to define histone modification features with improved
accuracy, and to predict identity genes and their master transcription factors systematically for given cell types
in numerous biological systems or disease models. Our functional assay for new identity genes of ECs will
improve mechanistic understanding of endothelial differentiation, development, and phenotypes, and will better
guide discovery of therapeutic targets for treatment of vascular diseases. Although we focus on histone
modification features for EC identity genes, ...
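
The abstract's idea of nominating cell identity genes by broad H3K4me3 can be illustrated with a minimal sketch. This is not DANPOS or DANPOS2 and is not the investigators' method; it only assumes that peaks have already been called elsewhere and ranks genes by the width of the peak covering their transcription start site (TSS). The `Peak` type, the `rank_genes_by_peak_breadth` function, the gene names, the coordinates, and the `min_width` cutoff of 4000 bp are all hypothetical choices made for illustration.

```python
from typing import Dict, List, NamedTuple, Tuple


class Peak(NamedTuple):
    """A called H3K4me3 peak (hypothetical input format)."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def rank_genes_by_peak_breadth(
    peaks: List[Peak],
    tss: Dict[str, Tuple[str, int]],
    min_width: int = 4000,  # illustrative cutoff, not taken from the source
) -> List[Tuple[str, int]]:
    """Return (gene, width) pairs for genes whose TSS lies inside a peak
    at least `min_width` bp wide, sorted from broadest to narrowest."""
    best: Dict[str, int] = {}
    for gene, (chrom, pos) in tss.items():
        for peak in peaks:
            if peak.chrom == chrom and peak.start <= pos < peak.end:
                width = peak.end - peak.start
                # Keep the broadest peak that covers this gene's TSS.
                best[gene] = max(best.get(gene, 0), width)
    # Sort by descending width, then by gene name for a stable order.
    ranked = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    return [(gene, width) for gene, width in ranked if width >= min_width]


# Made-up example data: placeholder genes and coordinates.
peaks = [
    Peak("chr1", 1000, 7000),    # 6000 bp, broad
    Peak("chr1", 20000, 20800),  # 800 bp, narrow
    Peak("chr2", 500, 5500),     # 5000 bp, broad
]
tss = {
    "GENE_A": ("chr1", 3000),
    "GENE_B": ("chr1", 20400),
    "GENE_C": ("chr2", 1000),
}

candidates = rank_genes_by_peak_breadth(peaks, tss)
print(candidates)  # [('GENE_A', 6000), ('GENE_C', 5000)]
```

In practice a fixed width cutoff would likely be replaced by a data-driven threshold, and the nested loop by an interval index; both are left out here to keep the sketch short.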

## Key facts

- **NIH application ID:** 10140377
- **Project number:** 5R01GM125632-05
- **Recipient organization:** BOSTON CHILDREN'S HOSPITAL
- **Principal Investigator:** Kaifu Chen
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $374,170
- **Award type:** 5
- **Project period:** 2018-07-01 → 2023-04-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10140377

## Citation

> US National Institutes of Health, RePORTER application 10140377, Computational epigenetics modeling of cell identity genes (5R01GM125632-05). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10140377. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
