Data Science Core

NIH RePORTER · NIH · U01 · $84,595

Abstract

With the introduction of the Big Data paradigm, scientific investigation has become increasingly dependent on the ability to collect, manage, and process large amounts of data. Unfortunately, the scientific benefits of abundant data sources have often been outweighed by the inadequacy of existing data science tools, as scientists spend excessive time managing and processing data instead of focusing on the science. Worse, researchers face serious obstacles when trying to associate the results published in journal papers with the data and workflows needed to replicate them.

The Data Science Core of the Berghia Brain Project (BBP) plans to provide its large, distributed team of scientists with the cyberinfrastructure needed to accomplish their aims without diverting time and energy from scientific activities. In particular, the activities of each of the five Research Projects will require, to varying degrees, easy and intuitive access to a combination of massive storage, imaging devices, cloud resources, and high-performance computing (HPC) platforms to execute scientific workflows in a reliable and repeatable manner. Internal collaboration and external dissemination of data and results will challenge existing solutions, necessitating a cyberinfrastructure tailored to the BBP. The strategy for developing this cyberinfrastructure relies on three complementary capabilities: (i) create a data management infrastructure that connects all the institutions in a federated data store; (ii) develop a scalable computing infrastructure that allows timely processing of the massive amounts of data acquired by each project; and (iii) develop services for interoperability and provenance tracking that enable the sharing of software tools, science workflows, and versioned data, so that BBP findings can be published according to the best standards of reproducible science.

Key facts

NIH application ID: 10302201
Project number: 1U01NS123972-01
Recipient: UNIVERSITY OF MASSACHUSETTS AMHERST
Principal Investigator: Valerio Pascucci
Activity code: U01
Funding institute: NIH
Fiscal year: 2021
Award amount: $84,595
Award type: 1
Project period: 2021-09-15 → 2023-08-31