# Refining mutation rates and measures of purifying selection with an application to understanding the impact of non-coding variation on neuropsychiatric diseases

> **NIH R01** · UNIVERSITY OF CHICAGO · 2021 · $410,119

## Abstract

Mutation and natural selection are fundamental forces of evolution, and their intensities across the genome are
key factors in determining the genomic landscape of human genetic disease variation and evolution. The goal
of the proposal is to construct a detailed map of mutation rates and purifying selection along the human
genome using novel statistical methodologies. Existing approaches to estimating mutation rates and selection
are often based on genome comparison across species, but for the purpose of studying human genetics and
evolution, we believe estimates inferred from human population data are more relevant and increasingly feasible
thanks to large-scale sequencing. Statistical methods for intra-human analysis, however, are in their infancy
and face a number of challenges; for example, many factors affecting mutation rates are unknown, and
complex human demographic changes complicate the inference of selection.
We propose three specific aims: (1) Estimation of base-level mutation rates across the human genome. We will
use de novo mutations from pedigree sequencing data to directly estimate germline mutation rates. Our model
will incorporate a large set of genomic features potentially associated with mutation rates, including novel ones
not utilized by earlier methods, such as DNA structure and epigenomic information in germline cells. Our
statistical model also incorporates a random effect component and captures spatial correlations of mutation
rates between nearby regions at multiple scales. (2) Inference of purifying selection in the human genome.
Existing methods for detecting intra-species constraint often rely on one of multiple signatures of selection at a
time (e.g. depletion of variants compared with neutral expectation), and have limited power in detecting
selection on individual elements, such as a putative enhancer. We will develop a unified statistical model that
leverages several major signals to detect selection at both base and element levels. Our model uses the
powerful Poisson Random Field (PRF) model, taking complex human demographic history into account. We
also leverage mutation rate estimates from Aim 1 and use a number of genomic annotations to set the prior
distribution of selection effects through a hierarchical Bayesian model. (3) Studying the role of human
constrained sequences in disease genetics. We hypothesize that sequences under selective constraint in
humans, both coding and noncoding, are highly enriched with disease-causing variants. We will test this
hypothesis using data from genome-wide association studies (GWAS), with a special focus on
neuropsychiatric phenotypes. We will develop procedures that leverage both functional genomic data and
selective constraints to prioritize disease variants.
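
As a rough illustration of the Poisson Random Field idea named in Aim 2, the sketch below computes the expected site frequency spectrum under the textbook constant-population-size PRF and shows how purifying selection shifts variants toward low frequencies. It is not the proposal's model: the abstract states that the proposed method accounts for complex demographic history, which this sketch ignores. The function name `prf_expected_sfs` and its parameters (`n`, `theta`, `gamma`, `steps`) are hypothetical and chosen only for this example.

```python
import math


def prf_expected_sfs(n, theta, gamma, steps=20000):
    """Expected number of sites with derived-allele count i, for i = 1..n-1.

    Assumption: constant-size Poisson Random Field. theta is the scaled
    mutation rate of the region and gamma the scaled selection coefficient
    (negative = deleterious). Scaling conventions differ between papers,
    so the absolute numbers are illustrative only.
    """
    def weight(x):
        # (1 - exp(-2*gamma*(1-x))) / (1 - exp(-2*gamma)),
        # which tends to (1 - x) as gamma -> 0 (neutral case).
        if abs(gamma) < 1e-8:
            return 1.0 - x
        return math.expm1(-2.0 * gamma * (1.0 - x)) / math.expm1(-2.0 * gamma)

    h = 1.0 / steps
    sfs = []
    for i in range(1, n):
        total = 0.0
        for k in range(steps):
            x = (k + 0.5) * h  # midpoint rule on (0, 1)
            # Binomial sampling term x^i (1-x)^(n-i) times the frequency
            # density weight(x) / (x (1-x)), simplified to avoid dividing by 0.
            total += x ** (i - 1) * (1.0 - x) ** (n - i - 1) * weight(x)
        sfs.append(theta * math.comb(n, i) * total * h)
    return sfs


neutral = prf_expected_sfs(n=10, theta=1.0, gamma=0.0)
selected = prf_expected_sfs(n=10, theta=1.0, gamma=-10.0)

# Fraction of segregating sites that are singletons in each scenario.
print(neutral[0] / sum(neutral), selected[0] / sum(selected))
```

In the neutral case the expected counts reduce to `theta / i`. With negative `gamma`, the total number of segregating sites falls and the singleton fraction rises, which corresponds to two of the signatures of selection (variant depletion and a frequency shift) that a unified model could combine.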

## Key facts

- **NIH application ID:** 10245296
- **Project number:** 5R01HG010773-02
- **Recipient organization:** UNIVERSITY OF CHICAGO
- **Principal Investigator:** Xin He
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $410,119
- **Award type:** 5
- **Project period:** 2020-09-01 → 2024-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10245296

## Citation

> US National Institutes of Health, RePORTER application 10245296, Refining mutation rates and measures of purifying selection with an application to understanding the impact of non-coding variation on neuropsychiatric diseases (5R01HG010773-02). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10245296. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
