# Deep Learning Based Natural Language Processing Markers of Anxiety and Depression

> **NIH K23** · NEW YORK UNIVERSITY SCHOOL OF MEDICINE · 2024 · $195,592

## Abstract

 Major Depressive Disorder (MDD) and Generalized Anxiety Disorder (GAD) are among the primary
causes of health burden worldwide. MDD is a leading cause of disability associated with increased mortality
risk, and both MDD and GAD result in considerable economic costs, loss of functioning, and decreased quality
of life. One of the biggest challenges in responding to current calls for population-level screening is to monitor
MDD and GAD at a large scale while minimizing assessment burden. Existing assessment methods, however,
rely on subjective measures, are based on diagnostic approaches, and are burdensome to the extent needed
to characterize MDD and GAD in their heterogeneity, which would require combined evaluation of all
symptoms. New methods are needed to accurately assess behavioral health, overcome barriers to monitoring
and care, and advance the scientific understanding of depression and anxiety.
 The proposed study aims to address these gaps by deconstructing MDD and GAD into Digital
Biomarkers (DB) based on linguistic features identified by large language models. State-of-the-art artificial
intelligence and Natural Language Processing methods allow representation learning of DB from cognitive and
emotional domains captured from linguistic information. While effective, passive, and at-scale monitoring are
the primary benefits of DB, we will also use them to study relevant Research Domain Criteria (RDoC),
including negative valence system reactions and positive valence traits. The study goals are to: 1) Design DB
of MDD and GAD symptoms using deep learning methods, by training an attention-based language model on a
very large corpus of de-identified psychotherapy treatment transcripts; 2) Examine preliminary performance
and feasibility of the DB model in a highly characterized sample of MDD and GAD patients, and compare
results with clinician ratings; 3) Explore improvements to the DB model based on research paradigms
consistent with RDoC constructs, to further refine the DB model pipeline and its future deployment in clinical settings.
 The program of research and training described in this mentored patient-oriented research career
development award is aimed at developing systematic digital health approaches to allow dimensional
conceptualization of MDD and GAD consistent with RDoC, enhancing the ease and consistency of detection to
ultimately support targeted interventions. The proposed project is strongly supported by a multidisciplinary
team including the mentorship of Drs. Naomi Simon and Kyunghyun Cho, and the domain expertise of Drs.
Paul Glimcher, Tim Althoff, Zhe Chen, and Tanzeem Choudhury. The experience gained from the award will
enable the pursuit of future R-level studies focusing on advanced computational psychiatry approaches to
further refine DB models to improve passive and objective assessment of behavioral health, and ultimately
improve our empirical understanding of depression and anxiety.

## Key facts

- **NIH application ID:** 10862838
- **Project number:** 5K23MH134068-02
- **Recipient organization:** NEW YORK UNIVERSITY SCHOOL OF MEDICINE
- **Principal Investigator:** Matteo Malgaroli
- **Activity code:** K23
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $195,592
- **Award type:** 5 (non-competing continuation)
- **Project period:** 2023-06-08 → 2028-05-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10862838

## Citation

> US National Institutes of Health, RePORTER application 10862838, Deep Learning Based Natural Language Processing Markers of Anxiety and Depression (5K23MH134068-02). Retrieved via AI Analytics 2026-06-01 from https://api.ai-analytics.org/grant/nih/10862838. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*