# Methods for single-cell CRISPR screens and multiomic data: constructing powerful well-calibrated tests, circumventing unmeasured confounding, and accounting for denoising and imputation

> **NIH R01** · CARNEGIE-MELLON UNIVERSITY · 2024 · $592,551

## Abstract

In this application, we request continuation of MH123184, which aimed to understand how
genetic variation alters transcription in specific cells and thereby produces psychopathology. Our research
developed statistical methods to integrate single cell and tissue-level transcriptomic data. We targeted
methods to identify gene communities, defined in terms of cell type and spatiotemporal window, to
understand how genes act in concert to confer risk for psychopathology. We also took advantage of an
exciting new avenue of research to approach these challenges, namely CRISPR screening. This innovation
has emerged as a powerful tool to characterize the effects of genetic perturbations on the entire
transcriptome at a single-cell level. Here we propose research covering three related themes, all of which
capitalize on CRISPR advancements: (1) develop powerful and well-calibrated tests for the effect of CRISPR
perturbations on gene expression by inferring latent factors; (2) develop methods for removing the effect of
unmeasured confounders in high-throughput screens; and (3) develop methods for imputation and
denoising for multiomic data that facilitate downstream testing of omic readouts. Each of these aims is
motivated by pressing needs in the field. First, due to small samples and the sparsity of the response
variable, it is essential that we enhance the power and interpretability of CRISPR tests by accounting for
co-regulation and convergent function of genes. Aim 1 achieves this purpose by estimating latent factors that
represent co-regulated genes and by inferring a similarity matrix among gene perturbations. As CRISPR
screens advance to more biologically complex settings, such as model organisms, unmeasured confounders
will play a more important role, and new methods are needed to control for these effects. Aim 2 develops
two approaches to this challenge: an innovative use of negative control variables, as motivated by the
causal literature, and key advances to the classic surrogate variable analysis method. For the field to move
toward efficient use of multiomic data, data derived from multiple sources will be required. These
resources will invariably have missing data. Methods to account for imputation of missing data are needed.
Tools developed for variational autoencoders show great promise; however, as described in Aim 3, they
need to be paired with semiparametric inference tools to ensure robust and well-calibrated downstream
analysis. By applying what we learn from these three aims to available data, most from distributed
resources and some from our collaborations, we expect to shed more light on the neurobiological
mechanisms of mental illness. We are well positioned to move between theory and data because
we have a diverse team of investigators led by the PI (Roeder), who has decades of experience
in statistical genomics, and co-investigators Wasserman and Lei, who are experts in theory and
methods for h...

## Key facts

- **NIH application ID:** 10880802
- **Project number:** 2R01MH123184-05
- **Recipient organization:** CARNEGIE-MELLON UNIVERSITY
- **Principal Investigator:** KATHRYN M ROEDER
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $592,551
- **Award type:** 2
- **Project period:** 2020-05-01 → 2029-02-28

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10880802

## Citation

> US National Institutes of Health, RePORTER application 10880802, Methods for single-cell CRISPR screens and multiomic data: constructing powerful well-calibrated tests, circumventing unmeasured confounding, and accounting for denoising and imputation (2R01MH123184-05). Retrieved via AI Analytics 2026-06-12 from https://api.ai-analytics.org/grant/nih/10880802. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
