Coupling a multifunctional tag to scalable endogenous tagging technology for improved genome-wide perturbation screens

NIH RePORTER · NIH · F31 · $34,179

Abstract

Project Summary: Characterizing the functions of protein-coding genes is an important goal in the post-genomic era. While proteins are the ultimate effectors of most cellular functions, including those mis-regulated in disease, we have an extremely limited understanding of the roles of the majority of proteins in the human proteome. Though powerful, existing technologies for the high-throughput interrogation of protein-coding genes, including CRISPR/Cas9-based approaches and RNA interference, require extended periods of time to effect changes in protein levels, and thus suffer from two critical shortcomings. First, they are unable to detect the contribution of growth-essential genes to any cellular process other than viability, as any cell carrying a perturbation in such a gene would fail to propagate. Second, compensatory and adaptive effects have ample opportunity to manifest, thus confounding screen results by ameliorating the effect of the perturbation or by generating a novel, unrelated effect. To address these critical limitations, I propose to develop a new screening technology that will minimize the time between perturbation and screen readout by inducibly and rapidly degrading endogenous proteins. This is made possible by a readily scalable endogenous tagging technology that harnesses homology-independent targeted integration to insert a synthetic exon into the intron of a protein-coding gene at the site of a double-strand break. The synthetic exon will encode a multifunctional ligand-binding protein that, depending on the ligand, will lead to fluorescence or rapid degradation. Pooled libraries of sgRNAs targeting different introns allow for the creation of custom libraries of cells, where each cell carries this multifunctional tag on a different protein.
The utility of this approach will be established in Aims 1 and 2 by testing (1) whether cells that have undergone rapid depletion of growth-essential proteins are maintained in the cell library at the end of the short perturbation window and (2) whether rapid depletion and CRISPR knockout of the same protein produce different effects on a well-established phenotype, due to the distorting effects of adaptation events in the knockout. Aim 3 will apply a machine learning approach to the data from thousands of attempted tagging events to identify how the features of a potential tag site dictate the likelihood that a functional protein carrying the multifunctional tag will be produced. The resulting model will be applied to the protein-coding genome to predict high-quality tag sites for as many protein-coding genes as possible. This will establish an improved screening paradigm that will allow for the pooled interrogation of the contributions of thousands of proteins to a phenotype of interest, and will thus accelerate the rate at which we come to understand the poorly understood elements of the protein-coding genome. These efforts will be well supported by the outstanding resources for e...

Key facts

NIH application ID
10424561
Project number
5F31HG011185-03
Recipient
UNIVERSITY OF PENNSYLVANIA
Principal Investigator
Stephanie Elizabeth Sansbury
Activity code
F31
Funding institute
NIH
Fiscal year
2022
Award amount
$34,179
Award type
5
Project period
2020-07-01 → 2023-06-30