Web-based visualization of coronavirus genomes and proteins

NIH RePORTER · NIH · R01 · $354,767

Abstract

Project Summary

This supplemental proposal knits together several bioinformatics visualization tools in the service of SARS-CoV-2 genome analysis. The core of the proposal is a newly prototyped JavaScript viewer, ABrowse, that renders multiple sequence alignments, navigable by phylogenetic trees and integrated with protein structure views, all in a single embeddable component. The ABrowse viewer is currently used to render the Pfam SARS-CoV-2 special release: a collection of 40 protein domains from the coronavirus genome, along with PDB structures. (ABrowse is also a candidate for Pfam's future default viewer, as noted in the letters of support.)

We propose to accelerate ABrowse development for use during the COVID-19 pandemic, specifically targeting the scaling, performance, and integration issues that are most relevant to scientists studying the virus. Chief among these is scaling ABrowse to handle millions of protein sequences (and/or SARS-CoV-2 genome sequences) by means of a new, compressed storage format suitable for random-access, user-driven exploration of very large trees (and alignments) over the web. Beyond scaling, we also address integration, developing plugins that let ABrowse run within JBrowse (the genome browser that is the focus of the project to which this is a supplemental proposal) as well as Auspice (the web dashboard of NextStrain, the phylogenetic genome alignment and annotation package that is widely used for COVID-19 analysis). We also propose several user interface enhancements to make ABrowse more useful as a navigation tool for COVID-19 data.

Key facts

NIH application ID
10162044
Project number
3R01HG004483-12S1
Recipient
UNIVERSITY OF CALIFORNIA BERKELEY
Principal Investigator
Ian H Holmes
Activity code
R01
Funding institute
NIH
Fiscal year
2020
Award amount
$354,767
Award type
3
Project period
2020-09-04 → 2021-06-30