# Advancing Computational Linguistic Biomarkers of Disorganized Speech in Psychosis

> **NIH K23** · FEINSTEIN INSTITUTE FOR MEDICAL RESEARCH · 2022 · $195,480

## Abstract

Disorganization in psychosis has important clinical implications but is under-studied. Several lines of evidence
suggest that disorganization reflects higher genetic loading and worse outcomes, and is sensitive to treatment
response and relapse. We will use computational linguistics to measure disorganization in a sensitive,
objective, efficient, reproducible, and repeatable way. Speech will be elicited with open-ended and structured
tasks from 270 people with schizophrenia spectrum disorders and mood disorders with psychotic features,
generating ~30,000 sentences across the sample. Findings will be validated in an existing independent
dataset. We will measure psychosis symptoms, functioning, and cognition in both samples. Incoherence and
inefficiency will be labeled for individual sentences and rated for the overall participant. Our Specific Aims are
as follows: (1) Develop deep-learning methods to classify sentence-level disorganization; (2) Integrate across
computational features to predict participant-level disorganization; (3) Predict key participant characteristics
using linguistic features. An integrated training plan will combine hands-on experience through these research
aims with mentorship, coursework, self-study, seminars, and conferences to achieve the following Training
Goals: (1) Proficiency in computational linguistics and machine learning methods, (2) Expertise in validating
clinically relevant biomarkers, and (3) Development as a physician-scientist. This work provides the foundation
needed to develop cutting-edge computational methods into biomarkers of disorganization and key psychosis
outcomes. We lay the groundwork for future studies that leverage these features as early markers of treatment
response and relapse, and that use these features to connect behavioral phenotypes with underlying biology.
The proposed project builds on my existing expertise to develop the technical proficiency and expertise in
psychosis biomarker research I need to lead new discoveries in this area as an independent
physician-investigator.

## Key facts

- **NIH application ID:** 10507015
- **Project number:** 1K23MH130750-01
- **Recipient organization:** FEINSTEIN INSTITUTE FOR MEDICAL RESEARCH
- **Principal Investigator:** Sunny Xiaojing Tang
- **Activity code:** K23
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $195,480
- **Award type:** 1
- **Project period:** 2022-09-01 → 2026-08-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10507015

## Citation

> US National Institutes of Health, RePORTER application 10507015, Advancing Computational Linguistic Biomarkers of Disorganized Speech in Psychosis (1K23MH130750-01). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10507015. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
