# The Human DNA virome: from petabase scale to single-cell resolution

> **NIH U01** · SLOAN-KETTERING INST CAN RESEARCH · 2024 · $1,061,833

## Abstract

PROJECT SUMMARY
 At the turn of the millennium, the cost of sequencing one megabase of DNA exceeded 10 million
dollars. Today, decoding that same megabase costs less than a penny. This dramatic reduction in the cost of
sequencing has catalyzed a genomics revolution and resulted in petabases (millions of billions of bases) of
DNA and RNA sequences from human cells and tissues. We hypothesize that this transformational capacity of
next-generation sequencing will now catalyze our understanding of the human virome—both by illuminating the
set of DNA viruses in human tissues and by profiling the host (human) cells that are infected.
 Here, we outline new computational and experimental methods to realize the unique potential of
petabase-scale sequencing data in studying human DNA viruses. Our work will uncover foundational aspects
of the human virome, including tissue tropism and cellular reservoirs for all DNA viruses. Further, we present
complementary strategies through host genetics analyses and single-cell multi-omics to define and
characterize the molecular interactions of human cells associated with viral infections and latency.
 First, in Aim 1, we will develop methods to quantify latent viral features from petabases of unmapped
whole genome sequencing reads from hundreds of thousands of individuals. These new molecular variables
will reveal the degree of latent viral DNA in blood and will be paired with comprehensive host genotyping and
phenotyping. We will determine host genetic factors associated with high viral levels and nominate
phenotypes, including complex disease, that may be driven by long-term latent infection in individuals.
 In Aim 2, we will extend our petabase-scale resource (Serratus) to create a ‘Digital Human Virome’ by
uniformly processing billions of dollars of public sequencing data from human cells and tissues to identify and
quantify all DNA viruses. Our resource will aggregate metadata to extract sex, cell/tissue of origin, disease
status, and geographic location to create a Digital Human Virome for DNA viruses, revealing tissue tropism
that can be mined for clinical associations, such as our recent discovery of HHV-6 reactivation in CAR T cells.
 Finally, in Aim 3, we will develop a new high-throughput single-cell multi-omics technology termed
‘Latent-seq’ that will identify individual human cells that harbor latent viruses with paired high-quality cell state
measurements. We will first establish and benchmark the assay using a set of well-defined cell lines before
extending applications to primary human tissues in collaboration with the Human Virome Program Consortium.
 Together, these workflows will define the human DNA virome in health and disease by leveraging this
unique moment in the capacity of genomics technologies. As every human gets infected by endemic eukaryotic
DNA viruses, but only some individuals ever show symptoms, our systematic approaches will uncover new
associations between molecular interactions and ...

## Key facts

- **NIH application ID:** 10987464
- **Project number:** 1U01AT012984-01
- **Recipient organization:** SLOAN-KETTERING INST CAN RESEARCH
- **Principal Investigator:** Caleb Andrew Lareau
- **Activity code:** U01
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $1,061,833
- **Award type:** 1
- **Project period:** 2024-09-20 → 2029-08-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10987464

## Citation

> US National Institutes of Health, RePORTER application 10987464, The Human DNA virome: from petabase scale to single-cell resolution (1U01AT012984-01). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10987464. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
