Efficient and scalable pangenomes with the move structure

NIH RePORTER · NIH · R21 · $198,495

Abstract

Pangenome references and indexes have been shown to alleviate the reference bias problem. Computer scientists recently described the novel “move structure,” which supports pattern-matching capabilities similar to those of the more typical r-index or FM-index structures, but with radically improved locality of reference. That is, move-structure algorithms access computer memory in a predictable way that minimizes cache misses and other pauses due to data movement. We will adapt the move structure to the problem of pangenome indexing, enabling extremely and consistently fast pangenome queries. This will allow us to leverage inclusive, bias-avoiding pangenomes in applications where (a) we must keep up with a sequencer in real time, e.g., nanopore sequencing, or (b) the index is so big that we must divide it across many computers, e.g., BLAST-like sequence classification.

Key facts

NIH application ID
10813518
Project number
1R21HG013433-01
Recipient
JOHNS HOPKINS UNIVERSITY
Principal Investigator
Benjamin Thomas Langmead
Activity code
R21
Funding institute
NIH
Fiscal year
2024
Award amount
$198,495
Award type
1
Project period
2024-02-01 → 2026-02-28