DMS/NIGMS 1: Multilayer network approach to tandem repeat variation in genomes

NIH RePORTER · NIH · R01 · $148,495

Abstract

Understanding the genetic bases of biological function is a fundamental question in biological sciences. Traditionally, the conservation of genetic sequences across species and populations has been a primary concept with which to measure functionality. However, recent biochemical characterizations of DNA have challenged this definition of functionality and argued that up to 80% of the human genome is functional. Several studies have pursued the possibility that biological function evolves as an adaptive response to rapid changes under environmental pressures, whereby sequence conservation does not directly predict function. By integrating -omics datasets and multilayer network approaches, we will specifically test the following four hypotheses: (1) Among the millions of tandem repeats, a small portion, still corresponding to thousands of loci, is functionally relevant. We further hypothesize that the majority of these functional tandem repeats will be evolving under negative selection and will primarily cluster together in multilayer networks of tandem repeat units. (2) Exonic tandem repeats have evolved as molecular tools to regulate the dosage of a particular functional motif. Thus, we expect that these functional tandem repeats will retain sequence conservation among paralogs as well as among species. (3) There are hundreds of tandem repeats in the mammalian genome that evolve under lineage-specific positive selection. We expect that such positively selected tandem repeats show unusual species-specific copy number expansions or contractions, and may affect gene expression and phenotypic traits more often than neutrally evolving tandem repeats. (4) Tandem-repeat copy number variation, if functional, primarily affects phenotypic variation related to immunity and metabolism in humans. We expect that these repeat loci evolve under positive selection.
To test these hypotheses, we will develop mathematical/computational methods to find groups of core nodes in multilayer genetic networks, and then apply them to multilayer networks that we will build, in which each network layer is based on a specific type of relationship between tandem repeat units.
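
As a rough illustration of what "core nodes in a multilayer network" can mean, the sketch below builds a toy two-layer network over tandem repeat units and reports the nodes that belong to the k-core of every layer. This is not the project's method, which the abstract does not specify. Intersecting per-layer k-cores is a simple stand-in chosen for illustration, and the layer names (`sequence_similarity`, `co_expression`), the node labels (`r1` to `r5`), and the functions `k_core` and `multilayer_core` are all hypothetical.

```python
from collections import defaultdict


def k_core(edges, k):
    """Return the nodes in the k-core of an undirected graph given as an edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    nodes = set(adj)
    changed = True
    # Repeatedly drop nodes that have fewer than k neighbours among the survivors.
    while changed:
        changed = False
        for n in list(nodes):
            if len(adj[n] & nodes) < k:
                nodes.discard(n)
                changed = True
    return nodes


def multilayer_core(layers, k):
    """Nodes lying in the k-core of every layer (an illustrative definition only)."""
    cores = [k_core(edges, k) for edges in layers.values()]
    return set.intersection(*cores) if cores else set()


# Hypothetical toy data: each layer links tandem repeat units by one relationship type.
layers = {
    "sequence_similarity": [("r1", "r2"), ("r2", "r3"), ("r1", "r3"), ("r3", "r4")],
    "co_expression": [("r1", "r2"), ("r2", "r3"), ("r1", "r3"), ("r4", "r5")],
}

core = multilayer_core(layers, k=2)
print(sorted(core))
```

In this toy example the three repeat units that form a triangle in both layers survive, while units attached by a single edge in either layer are pruned. A real analysis would likely need a definition of core that accounts for dependencies between layers rather than treating each layer independently.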

Key facts

NIH application ID: 10919783
Project number: 5R01GM148973-03
Recipient: STATE UNIVERSITY OF NEW YORK AT BUFFALO
Principal Investigator: Naoki Masuda
Activity code: R01
Funding institute: NIH
Fiscal year: 2024
Award amount: $148,495
Award type: 5
Project period: 2022-09-24 → 2026-06-30