COINSTAC 2.0: decentralized, scalable analysis of loosely coupled data

NIH RePORTER · NIH · R01 · $378,525 · view on reporter.nih.gov ↗

Abstract

Project Summary/Abstract The brain imaging community is greatly benefiting from extensive data sharing efforts currently underway. However, there is still a major gap in that much data is still not openly shareable, which we propose to address. In addition, current approaches to data sharing often include significant logistical hurdles both for the investigator sharing the data (e.g. often times multiple data sharing agreements and approvals are required from US and international institutions) as well as for the individual requesting the data (e.g. substantial computational re- sources and time is needed to pool data from large studies with local study data). This needs to change, so that the scientific community can create a venue where data can be collected, managed, widely shared and analyzed while also opening up access to the (many) data sets which are not currently available (see overview on this from our group7). The large amount of existing data requires an approach that can analyze data in a distributed way while (if required) leaving control of the source data with the individual investigator or the data host; this motivates a dynamic, decentralized way of approaching large scale analyses. During the previous funding period, we developed a peer-to-peer system called the Collaborative Informatics and Neuroimaging Suite Toolkit for Anonymous Computation (COINSTAC). Our system provides an independent, open, no-strings-attached tool that performs analysis on datasets distributed across different locations. Thus, the step of actually aggregating data is avoided, while the strength of large-scale analyses can be retained. During this new phase we respond to the need for advanced algorithms such as linear mixed effects models and deep learning, by proposing to develop decentralized models for these approaches and also implement a fully scalable cloud-based framework with enhanced security features. To achieve this, in Aim 1, we will incorporate the necessary functionality to scale up analyses via the ability to work with either local or commercial private cloud environments, together with advanced visualization, quality control, and privacy and security features. This suite of new functions will open the floodgates for the use of COINSTAC by the larger neuroscience community to enable new discovery and analysis of unprecedented amounts of brain imaging data located throughout the world. We will also improve usability, training materials, engage the community in contributing to the open source code base, and ultimately facilitate the use of COINSTAC's tools for additional science and discovery in a broad range of applications. In Aim 2 we will extend the framework to handle powerful algorithms such as linear mixed effects models and deep learning, and to perform meta-learning for leveraging and updating fit models. And finally, in Aim 3, we will test this new functionality through a partnership with the worldwide ENIGMA addiction group, which ...

Key facts

NIH application ID
10622017
Project number
3R01DA040487-08S1
Recipient
GEORGIA STATE UNIVERSITY
Principal Investigator
VINCE D CALHOUN
Activity code
R01
Funding institute
NIH
Fiscal year
2022
Award amount
$378,525
Award type
3
Project period
2015-07-01 → 2025-06-30