# Comprehensive functional characterization and dissection of noncoding regulatory elements and human genetic variation

> **NIH UM1** · BROAD INSTITUTE, INC. · 2021 · $1,496,338

## Abstract

Summary
The ENCODE project has generated comprehensive maps of cis-regulatory elements (CREs) controlling the
transcription of genes within the human genome. These maps have been crucial in our efforts to understand
sequence variants linked to human traits and disease, as the majority of these variants are non-coding
regulatory changes rather than amino acid substitutions. However, even though we know the locations of
thousands of CREs, our understanding of how they operate is derived from a relatively small set of
well-described examples. Therefore, we plan to directly characterize the function of ENCODE CREs at a
genome-wide scale in multiple cell types. This will transition the field of functional genomics from a simple map
of regulatory elements towards a deep understanding of the fundamental rules governing regulatory logic down
to base-pair resolution. Achieving this will dramatically expand ENCODE’s utility by strengthening our ability
to interpret the effects of natural human variation on gene regulation.
We propose to directly measure regulatory activity of over 3% of the genome, pursuing loci highlighted as
important by ENCODE and other functional data. We will first apply computational methods to identify the most
biologically informative CREs, representing a diversity of regulatory logic and architecture, and will use
machine learning techniques to prioritize functional variants for characterization relevant to common and rare
human diseases, traits, and adaptation. Of these we will select 100,000 CREs and 375,000 variants,
representing ~100 Mb of genomic sequence, and characterize them using the massively parallel reporter
assay (MPRA) to understand each element’s regulatory activity. Then, to complement data from the MPRA, we
will characterize additional 1 Mb regions across 20 loci using CRISPR-based non-coding screens to build a
comprehensive picture of these loci. This strategy leverages the throughput and flexibility of MPRA while
maintaining the connectivity of regulatory logic in the CRISPR-based screens, which perturb elements within
their endogenous genomic context. This will help us judge the accuracy and completeness of ENCODE, while
also providing data from both approaches to address a wide variety of research questions. These methods are
difficult to apply to disease-relevant primary cells at full scale, but we will use the results of our MPRA and
CRISPR screens to inform our models and better predict the fundamental rules of regulatory logic. We will then
construct smaller, targeted libraries to test disease-specific variants in primary cells and use assays specific for
each of three autoimmune diseases: type 1 diabetes, inflammatory bowel disease, and lupus.
This approach will inform the research community on the rules governing the activity of the CREs mapped by
the ENCODE project, and will simultaneously provide concrete information about the function of hundreds of
thousands of sequence variants relevant for human...

## Key facts

- **NIH application ID:** 10241056
- **Project number:** 3UM1HG009435-04S1
- **Recipient organization:** BROAD INSTITUTE, INC.
- **Principal Investigator:** Pardis Christine Sabeti
- **Activity code:** UM1
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $1,496,338
- **Award type:** 3
- **Project period:** 2017-09-12 → 2023-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10241056

## Citation

> US National Institutes of Health, RePORTER application 10241056, Comprehensive functional characterization and dissection of noncoding regulatory elements and human genetic variation (3UM1HG009435-04S1). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10241056. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
