# Next Generation Computational Tools for Functional Genomics

> **NIH R01** · DANA-FARBER CANCER INST · 2020 · $665,458

## Abstract

PROJECT SUMMARY
During the last decade, Next Generation Sequencing (NGS) applications have expanded to include
measurement of dynamic outcomes underlying genomic function in development and disease. Measurements
related to functional elements that act at the protein and RNA levels, and regulatory elements that control gene
activity, are at the core of studies undertaken by large consortia and individual labs alike. These
measurements introduce levels of variability that give rise to data analytic challenges related to distinguishing
unwanted or uninteresting sources of variability from biologically relevant signals. Furthermore, new
technologies and improved data analytic ideas are giving rise to a need for new mapping algorithms to facilitate
deployment on increasingly large datasets. While existing tools have provided effective ways to process and
analyze data in functional genomics studies, new technologies, more complex biological questions, and the
availability of increasingly complete datasets are posing new challenges. Single cell RNA-seq and single cell
ATAC-seq technologies in particular have introduced complexities that current tools are not optimized to
address.
Our team has extensive experience developing computational tools and statistical methodology for functional
genomics, disseminated as open source software. Many of our methods have become standards among users
of high-throughput technologies and are commonly included as part of standard pipelines. Combined, these
software packages receive hundreds of thousands of downloads each year and the papers describing the
methods have been cited tens of thousands of times. Furthermore, Dr. Irizarry (PI) is a leader in the
Bioconductor project, one of the most widely used open-source projects for the analysis of high-throughput
genomics data, which has greatly facilitated the development and dissemination of our and others'
state-of-the-art statistical methodologies.
We have identified three specific computational challenges urgently requiring new or improved solutions that
can greatly benefit from our expertise. Namely, we propose to develop: fast and accurate read mapping
specialized for count-focused sequencing data; a unified statistical approach for normalization and
downstream analysis; and computational tools to integrate scATAC-seq data with scRNA-seq data and to use
public data to facilitate annotation and functional interpretation. We plan to disseminate our tools via open
source software and provide a user-friendly suite of packages that functional genomics researchers can use to
extract knowledge from their single cell RNA-seq or ATAC-seq data.

## Key facts

- **NIH application ID:** 9979396
- **Project number:** 1R01HG011139-01
- **Recipient organization:** DANA-FARBER CANCER INST
- **Principal Investigator:** Rafael Angel Irizarry
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $665,458
- **Award type:** 1
- **Project period:** 2020-09-22 → 2025-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/9979396

## Citation

> US National Institutes of Health, RePORTER application 9979396, Next Generation Computational Tools for Functional Genomics (1R01HG011139-01). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/9979396. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
