# C2: Data Science

> **NIH U19** · PRINCETON UNIVERSITY · 2024 · $663,334

## Abstract

Project Summary/Abstract: Core 2, Data Science

The Data Science Core will facilitate and standardize data collection and analysis for all research projects
within this U19 program. In particular, we will develop processes and systems for collecting, organizing, and
analyzing behavioral, imaging, electrophysiology, and neural manipulation data. To benefit the broader
neuroscience community, we will adopt shared data and metadata formats and make our pipelines publicly
available, with DataJoint as the common framework for scientific data pipelines and the Neurodata Without
Borders format to share large raw data. This platform will facilitate collaborative analysis of datasets by multiple
researchers within the project, and make our analyses reproducible and extensible by others. We will make our
code and data public in easy-to-find, open-access repositories, such as the BRAIN Initiative’s Distributed
Archives for Neurophysiology Data Integration and GitHub. Our use of these common data standards will make
the data interoperable and reusable, thus ensuring that our data publications adhere to FAIR guidelines.

The core’s first aim will be to provide standardized computational pipelines for neurophysiological and
behavioral data. We already have standardized data pipelines for collection of virtual-reality behavioral data
and preprocessing of mesoscope imaging and Neuropixels electrophysiology recordings. We now propose to
extend this effort to all data generated by the collaboration, via three new initiatives. First, we will construct a
shared platform, accessed by modular, user-friendly web apps, to support virtual-reality and
operant-conditioning tasks. Second, we will extend our preprocessing pipeline for electrophysiology and
calcium imaging data to support several state-of-the-art segmentation algorithms. Automation of
preprocessing, data transfer between systems, and standardization of manual curation steps will make
analyses faster and easier, enabling more effective and reproducible processing of neurophysiological data.
Third, we will develop infrastructure to support perturbations during behavior, including optogenetic,
pharmacological, and physical manipulations.

The core’s second aim will be to document the system, train users, and disseminate our computational
tools and workflows. This effort will alleviate burdens on researchers, accelerate research by promoting
standard software tools, increase adoption of standardized pipelines, and facilitate reuse of our data by others.
To facilitate training and use of these pipelines, we will develop integrated web-based tools that allow
worldwide access to and control of local data processing. The modular nature of these tools will make them
useful to and popular with the broader neuroscience community. We will provide continuous, in-person training
to all researchers and technicians, including yearly tutorials with external consultants. Together, these methods
for automating and standardizing da...

## Key facts

- **NIH application ID:** 10900694
- **Project number:** 5U19NS132720-02
- **Recipient organization:** PRINCETON UNIVERSITY
- **Principal Investigator:** Carlos D Brody
- **Activity code:** U19
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $663,334
- **Award type:** 5
- **Project period:** 2023-08-08 → 2028-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10900694

## Citation

> US National Institutes of Health, RePORTER application 10900694, C2: Data Science (5U19NS132720-02). Retrieved via AI Analytics 2026-05-28 from https://api.ai-analytics.org/grant/nih/10900694. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
