DANDI: Distributed Archives for Neurophysiology Data Integration

NIH RePORTER · NIH · R24 · $1,898,066

Abstract

Neuroscientific data span an enormous diversity of species and modalities, are generated by a wide variety of devices, and encapsulate the results of scientific thinking and decision making. The BRAIN Initiative has spearheaded a comprehensive informatics initiative to gather much of this neuroscientific data into standardized representations and to disseminate it through accessible platforms. DANDI (Distributed Archives for Neurophysiology Data Integration) is one such current effort to facilitate the aggregation and dissemination of neurophysiology research data using best practices and standards, and it has grown to accommodate about 400 TB of data across 100+ published datasets in slightly over 2 years. The archive supports a broad range of users with different levels of expertise by providing a spectrum of mechanisms, from Web-based to programmatic, for accessing and uploading data, and it helps build the expertise of the scientific user base through tutorials and workshops. We expect future datasets to be larger and more multimodal, ranging in size from many TBs to PBs, with richer metadata. To support the next generation of neuroscience researchers and the scales of computation and storage that will become necessary, we must archive, preserve, and process this data in a scalable and accessible way that is meaningful to both neuroscience researchers and software developers.

In Aim 1, we will integrate neurophysiology applications that scientists can easily use on large and diverse datasets to derive new insights and generate interactive figures, directly connecting the provenance claims to underlying data. In Aim 2, we will expand search functionality to query into the structure of individual data streams, enabling more complex queries that allow more precise interrogation and advanced analysis of data and help answer more specific neuroscientific questions. We will improve search to span information within DANDI and to facilitate linking and integration of DANDI data with related data available in other BRAIN Initiative archives. In Aim 3, we will improve interoperability of data in DANDI with other neurophysiology software tools, platforms, and applications, thereby strengthening the ecosystem of neurophysiology research. Community engagement and data reuse will be further enhanced through yearly workshops aimed at improving the quality of data and metadata and training users to use DANDI tools and data.

Overall, we will address the growing data management and dissemination needs of the neurophysiology community through a scalable, robust, interoperable, and standards-based neurophysiology archive that provides an easy-to-use graphical and interactive interface as well as computation services close to large datasets that can be accessed simply with a Web browser. We will provide a platform for seamlessly integrating with and enhancing existing research workflows. We aim to support scientific inquiry and c...

Key facts

NIH application ID: 10665988
Project number: 2R24MH117295-06
Recipient: MASSACHUSETTS INSTITUTE OF TECHNOLOGY
Principal Investigator: Satrajit Sujit Ghosh
Activity code: R24
Funding institute: NIH
Fiscal year: 2024
Award amount: $1,898,066
Award type: 2
Project period: 2019-08-01 → 2029-04-30