PROJECT SUMMARY

Along with differences in the environment, genetic variation is the ultimate source of phenotypic diversity within and between species. Due to the limitations of traditional short-read sequencing technologies, most research in human genetics and evolution has focused on single-nucleotide variants (SNVs) and short insertions and deletions. The recent development of long-read sequencing technologies has begun to reveal the prevalence and phenotypic impacts of larger insertions, deletions, and rearrangements, collectively termed structural variants (SVs). However, long-read sequencing methods remain impractical for large-scale application due to their high cost and low throughput. Consequently, the role of SVs in human evolution is still poorly understood. These challenges motivate the development of innovative computational approaches that combine the accuracy of SV discovery using long reads with the scale and global diversity of short-read sequencing datasets. To advance knowledge of how SVs impact fitness and genome function in humans, this proposed research project will:

1. Identify locally adaptive SVs by leveraging graph-based methods for SV genotyping. This approach enables accurate genotyping of SVs discovered with long reads in short-read datasets from diverse human populations. The population-wide genotypes generated with this method will allow for the discovery of signatures of historical positive selection on SVs.
2. Discover SVs that are shared among or exclusive to the modern human, Neanderthal, and Denisovan lineages using both graph genotyping and alignment-free methods. Placing SVs in their comparative evolutionary contexts will reveal divergent variants that may underlie important functional differences that distinguished these hominin groups.
3. Quantify the functional genomic impacts of SVs by combining long-read and short-read RNA sequencing of diverse human individuals.
These data will reveal how SVs may mediate phenotypic differences through effects on gene expression and splicing.

This research will be conducted in a strong genetics and genomics training environment, and will combine the evolution, computational genomics, and long-read sequencing expertise and resources of my sponsor, co-sponsor, and collaborator. My project will provide me with scientific training in evolutionary modeling, development of software tools, and functional genomic data analysis. The institutional environment will also facilitate my training in the communication, teaching, mentorship, and leadership skills that are essential for becoming a leading researcher in human genetics.