# Data Science Core

> **NIH U19** · PRINCETON UNIVERSITY · 2021 · $454,067

## Abstract

Project Summary: Core 2, Data Science 
 
Working memory, the ability to temporarily hold multiple pieces of information in mind for manipulation, is 
central to virtually all cognitive abilities. This multi-component research project aims to comprehensively 
dissect the neural circuit mechanisms of this ability across multiple brain areas. In doing so, it will generate a
very large quantity of data from multiple types of experiments, all of which must then be
integrated. The Data Science Core will support the individual research projects in discovering relationships
among behavior, neural activity, and neural connectivity. The Core will create a standardized computational 
pipeline and human workflow for preprocessing of calcium-imaging data. The pipeline will run either on local 
computers or in cloud computing services, and users will interact with it through a web browser. The 
preprocessing will incorporate existing image-processing algorithms, such as Constrained Nonnegative Matrix 
Factorization and convolutional networks. In addition, the Core will build a data science platform that stores 
behavior, neural activity, and neural connectivity in a relational database that is queried by the DataJoint 
language. Diverse analysis tools will be integrated into DataJoint, enabling the robust maintenance of 
data-processing chains. This data-science platform will facilitate collaborative analysis of datasets by multiple 
researchers within the project, and make the analyses reproducible and extensible by other researchers. We 
will develop effective methods for training and otherwise disseminating our computational tools and
workflows. Finally, the Core will make raw data, derived data, and analyses available to the public upon
publication via the data-science platform, source-code repositories, and web-based visualization tools. To 
facilitate the conduct of this research, the creation of software tools, and the reuse of the data by others after 
the primary research has concluded, the project will adopt shared data and metadata formats using the HDF5 
implementation of the Neurodata Without Borders format. Data will be made public in accordance with the FAIR
guiding principles: findable by a DOI and/or URL, accessible through a RESTful web API, and interoperable
and reusable due to DataJoint and the Neurodata Without Borders format for data and metadata. These tools 
will allow the researchers within the project to store, manipulate, and analyze their data efficiently and to share 
it with other researchers as needed.
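
As a rough illustration of the relational approach described above, the sketch below links behavioral trials to per-cell neural responses through shared keys and then queries across both. It uses Python's built-in `sqlite3` module rather than DataJoint, and every table name, column name, and value (`session`, `trial`, `cell_response`, `mean_dff`, `mouse_A`) is a hypothetical placeholder, not the project's actual schema or data.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical, made-up tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE session (
            session_id INTEGER PRIMARY KEY,
            subject    TEXT NOT NULL
        );
        -- Behavior: one row per trial, keyed by its parent session.
        CREATE TABLE trial (
            session_id INTEGER NOT NULL REFERENCES session(session_id),
            trial_id   INTEGER NOT NULL,
            correct    INTEGER NOT NULL,
            PRIMARY KEY (session_id, trial_id)
        );
        -- Neural activity: one row per cell per trial.
        CREATE TABLE cell_response (
            session_id INTEGER NOT NULL,
            trial_id   INTEGER NOT NULL,
            cell_id    INTEGER NOT NULL,
            mean_dff   REAL NOT NULL,
            PRIMARY KEY (session_id, trial_id, cell_id),
            FOREIGN KEY (session_id, trial_id)
                REFERENCES trial(session_id, trial_id)
        );
    """)
    # Illustrative values only; these are not real measurements.
    conn.execute("INSERT INTO session VALUES (1, 'mouse_A')")
    conn.executemany(
        "INSERT INTO trial VALUES (?, ?, ?)",
        [(1, 1, 1), (1, 2, 0), (1, 3, 1)],
    )
    conn.executemany(
        "INSERT INTO cell_response VALUES (?, ?, ?, ?)",
        [(1, 1, 10, 0.5), (1, 2, 10, 0.25), (1, 3, 10, 1.0)],
    )
    conn.commit()
    return conn


def mean_response_by_outcome(conn, cell_id):
    """Join behavior with neural activity: mean response per trial outcome."""
    rows = conn.execute(
        """
        SELECT t.correct, AVG(r.mean_dff)
        FROM trial AS t
        JOIN cell_response AS r USING (session_id, trial_id)
        WHERE r.cell_id = ?
        GROUP BY t.correct
        ORDER BY t.correct
        """,
        (cell_id,),
    ).fetchall()
    return {bool(correct): mean for correct, mean in rows}


if __name__ == "__main__":
    print(mean_response_by_outcome(build_demo_db(), cell_id=10))
```

Because the response table shares the session and trial keys with the behavior table, a cross-modality question becomes a single join instead of ad hoc file matching, which is the property the abstract attributes to the planned platform.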

## Key facts

- **NIH application ID:** 10247579
- **Project number:** 5U19NS104648-05
- **Recipient organization:** PRINCETON UNIVERSITY
- **Principal Investigator:** Hyunjune Sebastian Seung
- **Activity code:** U19
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $454,067
- **Award type:** 5
- **Project period:** 2017-09-28 → 2023-07-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10247579

## Citation

> US National Institutes of Health, RePORTER application 10247579, Data Science Core (5U19NS104648-05). Retrieved via AI Analytics 2026-05-21 from https://api.ai-analytics.org/grant/nih/10247579. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
