# Data Science Core

> **NIH U01** · UNIVERSITY OF MASSACHUSETTS AMHERST · 2021 · $84,595

## Abstract

With the introduction of the Big Data paradigm, scientific investigation has become increasingly dependent on
the ability to collect, manage, and process large amounts of data. Unfortunately, the scientific benefits of
abundant data sources have often been dwarfed by the inadequacy of existing data science tools, as
scientists spend excessive time managing and processing data instead of focusing on the science. Worse, researchers
face insurmountable challenges in associating the results published in journal papers with the data and
workflows needed to replicate the science. The Data Science Core of the Berghia Brain Project plans to provide
the large, distributed team of scientists with the cyberinfrastructure needed to accomplish their aims without diverting
time and energy from the science activities. In particular, the activities of each of the five Research Projects will require,
to varying degrees, easy, intuitive access to a combination of massive storage, imaging devices, cloud
resources, and high-performance computing (HPC) platforms to execute scientific workflows in a reliable and
repeatable manner. Internal collaborations and external dissemination of data and results will challenge existing
solutions, necessitating a tailored cyberinfrastructure for the Berghia Brain Project. The strategy used to develop
this cyberinfrastructure relies on the following three complementary capabilities: (i) create a data management
infrastructure that connects all the institutions in a federated data store; (ii) develop a scalable computing
infrastructure that can process the massive amounts of data acquired by each project in a timely manner;
and (iii) develop services for interoperability and provenance tracking that enable sharing of software tools, science
workflows, and versioned data, so that the Berghia Brain Project's findings can be published in line with the best
standards of reproducible science.

## Key facts

- **NIH application ID:** 10302201
- **Project number:** 1U01NS123972-01
- **Recipient organization:** UNIVERSITY OF MASSACHUSETTS AMHERST
- **Principal Investigator:** Valerio Pascucci
- **Activity code:** U01
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $84,595
- **Award type:** 1 (new application)
- **Project period:** 2021-09-15 → 2023-08-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10302201

## Citation

> US National Institutes of Health, RePORTER application 10302201, Data Science Core (1U01NS123972-01). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10302201. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
