OpenCRAVAT: Informatics Tools for High-Throughput Analysis of Cancer Mutations

NIH RePORTER · NIH · U24 · $633,077

Abstract

PROJECT SUMMARY
Cancer sequencing projects have identified a very large number of DNA mutations whose importance in cancer is not yet understood. To better understand the impact of these mutations, our team has produced a software tool for computational analysis of cancer mutations that can analyze millions of mutations at one time. This tool works as a funnel, helping researchers find the small number of mutations most likely to be informative among the very large number discovered in a sequencing project. The software allows users to design ways to combine multiple mutation evaluation metrics and to generate a prioritized list of the mutations most likely to be biologically important. These evaluation metrics include the molecular consequence, bioinformatic scores to identify pathogenic and driver mutations, frequency of the mutation in human populations, previous occurrence in tumor tissue types, pointers to literature, and visualization of annotated protein structures and networks.
A web-based version of the pipeline, the Cancer-Related Analysis of Variants Toolkit (CRAVAT), has been widely adopted (more than 3,000 jobs submitted per month on average in 2020). We have attracted a user community that spans both basic and clinical cancer researchers, all of whom rely on high-throughput tumor sequencing in their work. In 2019, we introduced OpenCRAVAT, which is distinguished by an open-source codebase and an open app store of tools and resources that can be used to better understand the importance and impact of mutations. The app store is driven by the user community: new apps are prioritized based on user requests, and many apps were contributed directly by outside tool developers. The app store currently aggregates tools from over 70 organizations, and these tools can be combined to identify mutations whose molecular impact contributes to tumorigenesis, prognosis, and treatment selection.
Initial adoption of OpenCRAVAT is encouraging, with over 10,000 local package downloads in the first two years. We expect that OpenCRAVAT will be adopted by a much larger community, given the increasing importance of DNA sequencing data in cancer research. We will continue to ensure that our tools are interoperable with other informatics tools and services, and that they can be run in different computational environments, such as the cloud or a local installation, so that data privacy can be maintained.

Key facts

NIH application ID: 10830980
Project number: 5U24CA258393-03
Recipient: JOHNS HOPKINS UNIVERSITY
Principal Investigator: Rachel Karchin
Activity code: U24
Funding institute: NIH
Fiscal year: 2024
Award amount: $633,077
Award type: 5
Project period: 2022-05-15 → 2027-04-30