Function-based exploration of genetic variation at genome-scale

NIH RePORTER · NIH · R01 · $786,893

Abstract

PROJECT SUMMARY

Genome-wide association studies have discovered thousands of genetic variants associated with phenotypic traits such as disease risk. Most of the associated variation lies within non-coding regions of the genome, and the causative effects of those variants remain largely unknown. The sparsity of knowledge on interactions between the coding and non-coding regulatory parts of the genome makes it impossible to predict variant function from genome sequence and location alone. We propose to experimentally uncover the functional relevance of genetic variants at large scale by perturbing variants and the genetic elements that contain them, and reading out the direct consequences of those perturbations on gene regulation. To this end, we propose to apply our recently developed CRISPR/Cas9 functional genomics screening technology with targeted single-cell transcriptomic readouts (targeted Perturb-seq, or TAP-Seq for short) to enable systematic interrogation of non-coding regions and the genetic variation therein. First, we will apply targeted Perturb-seq to decipher the regulatory circuitry encoded on an entire human chromosome by systematically perturbing all major genetic elements (enhancers, protein-coding genes, and lncRNA genes). This extensive data set will make it possible to decipher the complex regulatory networks controlling gene expression on the selected chromosome. Next, we will uncover causal regulatory variants in these regions by coupling high-throughput precision genome editing to simultaneous single-cell genomic and transcriptomic readout. Using this novel approach, we will be able to decipher the functional impact of genetic variants on gene expression and derive rules by which genetic variation perturbs gene regulatory processes.
We will integrate the generated data with available functional genomics data, such as transcription factor binding (ChIP-seq), chromatin accessibility (ATAC-seq, DNase-seq), and 3D chromatin interactions (Hi-C), to train machine learning models that derive the rules governing the observed regulatory interactions. These models will be applied to decipher the molecular mechanisms underlying the regulatory logic, and to predict regulatory interactions and variants throughout the genome and across cell types. Selected predictions will be experimentally validated using the established perturbation technologies, both to verify clinically relevant predictions and to improve the performance of the predictive models. Taken together, this project will answer fundamental questions in gene regulation, uncover the mechanisms by which genetic variation impacts gene expression, and create datasets and computational models as valuable tools for interpreting results from GWAS, eQTL, and clinical genomic studies.

Key facts

NIH application ID: 10367604
Project number: 1R01HG011664-01A1
Recipient: STANFORD UNIVERSITY
Principal Investigator: Lars M Steinmetz
Activity code: R01
Funding institute: NIH
Fiscal year: 2022
Award amount: $786,893
Award type: 1
Project period: 2022-09-09 → 2026-06-30