# Connecting transposable elements and regulatory innovation using ENCODE data

> **NIH U01** · WASHINGTON UNIVERSITY · 2021 · $464,997

## Abstract

Repetitive transposable elements (TEs) comprise over 50% of the human genome. While some
investigators regard TEs as “parasitic” DNA, other studies suggest that TEs play a more constructive role in
genome evolution by providing raw material for new biological functions. For example, TEs commonly
harbor active cis-regulatory elements that are occasionally co-opted during evolution to wire new gene
regulatory networks. While investigators now recognize the importance of TEs in gene regulation, TEs
remain under-analyzed in high-throughput data because of methodological hurdles associated with their
repetitive nature. Thus, the impact of TEs on the regulation of the human genome, both in normal
development and disease, remains largely uncharacterized. We propose to develop novel computational
methods to assess and clarify the impact of TEs in regulatory innovation using ENCODE data. In
Specific Aim 1 we will develop new algorithms and statistical methods to predict active regulatory elements
encoded by TEs from heterogeneous ENCODE data. If successful, we will generate a profile of TE-derived
regulatory elements and their predicted targets across diverse cell/tissue types and developmental stages,
revealing new gene regulatory networks wired by TEs. With these new methods we also intend to examine
the extent of TE dysregulation in cancer cells and its transcriptional consequences. In Specific Aim 2 we
will extend the models developed in Aim 1 to understand the role of TEs in shaping the 3D topology of the
genome, which is intimately connected to genome function. We will investigate the role of TEs in partitioning
the genome into chromosomal domains that orchestrate communication between cis-regulatory elements
and their target genes. In particular, we will quantify the extent to which TEs drive conservation and
divergence in genome topology across mammalian species. In Specific Aim 3 we will take advantage of the
repetitive nature of TEs to develop a novel statistical model that links sequence changes in different copies
of TEs to epigenetic and functional differences. The numerous but slightly different copies of a TE present
in a single genome provide a unique opportunity to identify sequence variants that underlie epigenetic
modification, which will further our understanding of how TEs become co-opted for host gene regulation.
Finally, in Specific Aim 4, we will deploy our recently developed Repeat Element Browser as a web portal
and downloadable application specifically tailored for investigators to analyze, visualize, and explore data
produced by ENCODE and other sources, as well as their own data, in the context of TEs. The methods developed in this
proposal will have a high impact on the utility of the data produced by ENCODE and will greatly expand our
understanding of the contribution of TEs to non-coding regulatory elements in healthy tissues and disease.

## Key facts

- **NIH application ID:** 10241106
- **Project number:** 3U01HG009391-04S1
- **Recipient organization:** WASHINGTON UNIVERSITY
- **Principal Investigator:** Barak A Cohen
- **Activity code:** U01
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $464,997
- **Award type:** 3
- **Project period:** 2020-09-01 → 2023-01-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10241106

## Citation

> US National Institutes of Health, RePORTER application 10241106, Connecting transposable elements and regulatory innovation using ENCODE data (3U01HG009391-04S1). Retrieved via AI Analytics 2026-05-22 from https://api.ai-analytics.org/grant/nih/10241106. Licensed CC0.

