# Data Science Core

> **NIH U19** · COLUMBIA UNIV NEW YORK MORNINGSIDE · 2021 · $665,877

## Abstract

Summary/Abstract (Data Science Core B)
This project—and the International Brain Laboratory (IBL) more generally—represents a tightly closed loop of
experiment, theory, and data analysis. This loop depends critically on sophisticated, scalable, and robust data
science resources and methods. This core will provide these resources and methods.

First, this core will extend the existing IBL data architecture to handle the new datasets that will be
collected as part of this project (Aim 1). IBL already uses this data architecture daily to collect experimental data
(including behavioral video and Neuropixels recordings) and metadata; automatically preprocess and analyze
the data; automatically transfer the data to a central server; and share the results within the collaboration and
externally. This core will extend this architecture to handle the new experiments and data types (calcium imaging,
functional ultrasound imaging, optogenetic perturbations, in situ sequencing) to be pursued here.

Second, this core will apply and refine sophisticated data-analysis algorithms directly related to the
project’s scientific goals, and serve these algorithms publicly as open-source tools to the broader community
(Aim 2). IBL already has good working pipelines in place to preprocess (spike sort) Neuropixels datasets. During
the proposed project, this core will continue to refine and improve these spike-sorting pipelines, incorporate new
pipelines to handle additional large data types (calcium imaging and in situ sequencing data) that are not
currently in place in the IBL infrastructure, and support development of methods for analyzing large-scale
multi-neuronal recordings from multiple brain areas over multiple experiments. All analytical and data-architecture
tools will be versioned, open-source, and immediately available for use and development by other laboratories.
A major synergistic aspect of IBL is that these pipelines will be heavily internally tested by many users with a
wide variety of expertise across multiple labs. These tools will also be served on the Neuroscience Cloud
Analysis as a Service platform to facilitate reproducible, easy usage. We thus expect the availability of these
new tools to have an immediate and broad impact on the field.

## Key facts

- **NIH application ID:** 10294670
- **Project number:** 1U19NS123716-01
- **Recipient organization:** COLUMBIA UNIV NEW YORK MORNINGSIDE
- **Principal Investigator:** Liam M Paninski
- **Activity code:** U19
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $665,877
- **Award type:** 1
- **Project period:** 2021-08-15 → 2026-07-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10294670

## Citation

> US National Institutes of Health, RePORTER application 10294670, Data Science Core (1U19NS123716-01). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10294670. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
