# Implementing the Genomic Data Science Analysis, Visualization, and Informatics Lab-space (AnVIL)

> **NIH U24** · JOHNS HOPKINS UNIVERSITY · 2020 · $399,999

## Abstract

The NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space (AnVIL) powers the
next generation of computational genomics research using cloud-scale data and compute resources. The
platform is built on a set of established components, including the Terra computing platform and Dockstore
for standards-based sharing of containerized tools and workflows. It also provides multiple entry points for
data access and analysis: batch workflows with Terra, notebook environments such as Jupyter and
RStudio, and Bioconductor packages for building analyses on top of AnVIL APIs and services. It will soon offer
Galaxy instances for interactive analysis. By providing a unified environment for data management and
compute, AnVIL eliminates the need for data movement, allows for controlled access to sensitive data and
monitoring, and provides elastic, shared computing resources that can be acquired by researchers as needed.
NIH-sponsored biomedical research is increasingly moving to cloud-based data storage and analysis systems,
with major cloud portals established for GTEx, Kids First, TOPMed, TCGA and several other major initiatives.
However, using these systems together is a challenge. The individual data portals enable researchers to browse
and query their own data but have limited functionality to share data or user registrations across portals or
with cloud-based workspaces such as Terra and Galaxy. The recently established NIH Cloud Platform
Interoperability (NCPI) effort aims to address these issues by implementing key interoperability technologies
across multiple NIH institutes. Under this project, we will work with the NCPI working groups to define the use
cases and standards for interoperability as well as implement three major technologies recommended by the
NCPI within the Galaxy and R/Bioconductor components of AnVIL. First, we will implement the NIH
Researcher Auth Service (RAS) to provide a common mechanism for researchers to establish their identity and
access data they are authorized to use across Terra and Galaxy. Second, we will implement the Global Alliance
for Genomics and Health (GA4GH) Data Repository Service (DRS) so that data consumers, including
workflow systems, can access data objects in a single, standard way regardless of where they are stored and
how they are managed. Finally, we will develop initial support in AnVIL for the Fast Healthcare
Interoperability Resources (FHIR) standard. This standard describes data formats, elements, and an API for
exchanging electronic health records (EHR), especially to ensure these records are available, discoverable, and
understandable as patients move around the healthcare ecosystem. FHIR support in AnVIL will facilitate
user access to data from eMERGE and related projects once those data are ingested into AnVIL.

## Key facts

- **NIH application ID:** 10220581
- **Project number:** 3U24HG010263-03S1
- **Recipient organization:** JOHNS HOPKINS UNIVERSITY
- **Principal Investigator:** Jeremy Goecks
- **Activity code:** U24
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $399,999
- **Award type:** 3
- **Project period:** 2018-09-21 → 2023-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10220581

## Citation

> US National Institutes of Health, RePORTER application 10220581, Implementing the Genomic Data Science Analysis, Visualization, and Informatics Lab-space (AnVIL) (3U24HG010263-03S1). Retrieved via AI Analytics 2026-05-22 from https://api.ai-analytics.org/grant/nih/10220581. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
