# Scalable tools for consistent identification of neuronal cell types in mouse and human

> **NIH RF1** · ALLEN INSTITUTE · 2021 · $1,081,350

## Abstract

Project Summary

The proposed work will address a critical gap in our understanding of neuronal phenotypes and cell types by developing machine learning algorithms and cloud-based software for integrating multiple-modality characterizations of large and growing datasets of cortical neurons in mouse and human. Through optimal and innovative use of potentially incomplete data and an emphasis on automated morphological characterization, the proposed tools will enable richer and more consistent characterizations of neurons from transcriptomic, anatomical, or electrophysiological profiling.

While large-scale, BICCN-funded cell type research programs rely on the notion of a unique neuronal identity that determines the cell’s phenotype across different observation modalities, overarching agreement across physiological, anatomical, and molecular characterizations remains elusive. Although these large-scale programs have succeeded in generating extensive multiple-modality datasets, the lack of principled, accurate, and widely available computational alignment and inference tools presents a roadblock to the success of the overall program. A second issue is that anatomical characterization, despite being the classical approach to understanding cell types, lags significantly behind molecular and physiological methods in terms of throughput.

The research proposed here aims to address the alignment problem by building on the coupled autoencoder approach, which presents an efficient optimization framework centered on the ubiquity of neuronal identity. Importantly, the proposed software can utilize incompletely characterized data points, which are common in practice, to produce unified visualization and analysis of abstract neuronal identity. This tool will be both flexible (e.g., the feature set can be changed) and extensible (e.g., more observation modalities can be added for joint alignment). The aligned representations enable consistent clustering of the neuronal population across the different observation modalities, which is a pressing problem in modern neuroscience.

We propose to address the anatomical throughput issue with an end-to-end computational pipeline, from the raw image of local neuronal arbors to an anatomical descriptor that can be readily aligned and interpreted by the coupled autoencoder software. By utilizing our extensive gold-standard manual reconstructions, we will train supervised deep artificial neural networks to segment neuronal arbors in sparse labeling scenarios. The rich set of training examples, together with algorithmic innovations, will endow this automated segmentation tool with superior generalizability, accelerating science for light microscopy-based studies.

## Key facts

- **NIH application ID:** 10365216
- **Project number:** 1RF1MH128778-01
- **Recipient organization:** ALLEN INSTITUTE
- **Principal Investigator:** Staci A Sorensen
- **Activity code:** RF1
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $1,081,350
- **Award type:** 1
- **Project period:** 2021-09-16 → 2024-09-15

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10365216

## Citation

> US National Institutes of Health, RePORTER application 10365216, Scalable tools for consistent identification of neuronal cell types in mouse and human (1RF1MH128778-01). Retrieved via AI Analytics 2026-05-28 from https://api.ai-analytics.org/grant/nih/10365216. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
