# Inference and application of graphs for genomic data

> **NIH R35** · UNIVERSITY OF CALIFORNIA BERKELEY · 2024 · $427,002

## Abstract

The genealogical structure for whole genomes can be described through Ancestral
Recombination Graphs (ARGs). ARGs are summaries that contain all of the information in
genomic sequencing data about processes such as demographic history, selection, and
recombination. The primary objective of this research is to develop a suite of computational
tools that use posterior sampling of ARGs in order to provide methods for testing hypotheses
about the distribution and evolution of genomic variation, and in general, to provide improved
quantification of mutation, recombination, selection, and demographic history. These methods
will be full likelihood/Bayesian methods that can take advantage of the rich population genetic
information in whole-genome sequencing data. We expect the methods to scale up to allow
posterior sampling of ARGs from a coalescence prior for many hundreds, or perhaps thousands,
of genomes. We will make an open-source, user-friendly, flexible, and integrated program
available to other researchers that will allow them to test a wide range of demographic and
evolutionary hypotheses on their own data. We will also develop associated methods for
ancestral inference of past migration and the geographic location of ancestors of an individual.
Additionally, we will develop improved methods for quantifying spatiotemporal patterns of
natural selection affecting the genome. We will apply the methods to modern and ancient DNA
to test hypotheses about the relative contribution of demographic processes and natural
selection for shaping the landscape of phenotypic variation in Europe, including disease
susceptibility. We will also use the methods to revisit an ongoing controversy over the relative
importance of changing mutation patterns and changing generation times in shaping the
pattern of human mutation variation. Finally, we will use the methods to develop more
accurate human recombination maps and to test hypotheses about recombination rate
variation.

In addition, we will develop new Bayesian Markov Chain Monte Carlo methods
for estimating Developmental Lineage Trees (DLTs) using mitochondrial heteroplasmies and
single cell DNA sequencing. We will also develop methods that can jointly analyze single cell
RNA sequencing and DNA sequencing data to make joint models of DLTs with associated
transitions in expression state. Such explicit temporal models of cell differentiation will be
central in the translational aspects of cell specific analyses, in particular for predicting the
effects of various forms of medical intervention.

## Key facts

- **NIH application ID:** 10843006
- **Project number:** 1R35GM153400-01
- **Recipient organization:** UNIVERSITY OF CALIFORNIA BERKELEY
- **Principal Investigator:** RASMUS NIELSEN
- **Activity code:** R35
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $427,002
- **Award type:** 1
- **Project period:** 2024-06-01 → 2029-04-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10843006

## Citation

> US National Institutes of Health, RePORTER application 10843006, Inference and application of graphs for genomic data (1R35GM153400-01). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10843006. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
