The regulatory landscape of segmentally duplicated genes: Implications for human evolution and disease

NIH RePORTER · NIH · K99 · $114,576

Abstract

Contact PD/PI: Vollger, Mitchell

PROJECT SUMMARY

Objective and Specific Aims: This project seeks to comprehensively understand how gene regulatory elements within segmental duplications (SDs), which are rapidly evolving areas of the human genome, influence human evolution and disease. Despite their potential significance, these regions have historically been challenging to study due to technical limitations. The specific aims of this project are:

1. Characterize segmental duplications across the human population by constructing a pangenome graph using thousands of high-quality genome assemblies.
2. Establish a statistical and computational methodology for mapping regulatory DNA within SDs using long-read chromatin fiber sequencing (Fiber-seq).
3. Identify conserved regulatory and genomic elements within segmental duplication loci by mapping genetic and epigenetic haplotypes into the pangenome graph.
4. Uncover the regulatory fate of multi-copy gene families by analyzing segmental duplication paralogs with Fiber-seq across tissues, determining whether these paralogs have undergone changes in regulatory function.

Relevance to the Agency's Mission: This research directly aligns with the institute’s mission to understand the genetic underpinnings of human evolution and disease. SD regions, due to their rapid evolution and complexity, have remained challenging to study. Yet they hold invaluable insights into human-specific genetic adaptations and susceptibilities to disease. Identifying and characterizing regulatory elements in these fast-evolving genomic regions will offer novel insights into human-specific traits, as well as potential vulnerabilities to disease.

Research Design and Methods: In this work, I will create a comprehensive SD pangenome graph by integrating thousands of long-read haplotypes from multiple consortia, which will significantly enhance our understanding of human variation within SDs. Next, I will use long-read Fiber-seq, together with a machine-learning framework developed for this purpose, to detect regulatory elements within SDs and use that information to impute the results of other short-read epigenetic assays. My approach will also involve a conservation analysis that prioritizes SD genes and regulatory elements. I will introduce a novel 'loss-of-paralog intolerance' (pLPI) score to rank these genes based on their conservation levels across populations. Additionally, the regulatory trajectories of SD genes will be determined using Fiber-seq conducted on a range of human tissues. This will help me identify distinct patterns such as neofunctionalization, subfunctionalization, or pseudofunctionalization.

This investigative approach will deliver an in-depth understanding of the regulatory mechanisms in SDs using cutting-edge genomic tools. The insights gained have the potential to highlight human-specific regulatory adaptations and could pave the way for discovering new therapeutic avenues in personalized medicine.

Key facts

NIH application ID: 10947161
Project number: 1K99GM155552-01
Recipient: UNIVERSITY OF WASHINGTON
Principal Investigator: Mitchell R. Vollger
Activity code: K99
Funding institute: NIH
Fiscal year: 2024
Award amount: $114,576
Award type: 1
Project period: 2024-08-01 → 2026-07-31