# Crowd-Powered Machine Learning to Diagnose ASD and ADHD in Adolescents from Digital Social Interactions

> **NIH DP2** · UNIVERSITY OF HAWAII AT MANOA · 2023 · $94,037

## Abstract

Digital technologies have the potential to provide remote and accessible psychiatric diagnostics to underserved
families who have traditionally been left out of the healthcare system. Several recent research efforts have
explored the use of structured behavioral data collection using digital devices coupled with automatic machine
learning (ML) algorithms to distinguish a particular psychiatric condition from neurotypical controls. While these
pure ML approaches have achieved performances >90% on balanced classification metrics on binary prediction
tasks, there are limits to their ability to feasibly quantify the complex, social behavioral features needed for
multi-condition and higher precision diagnostics. To enable such specificity for digital psychiatric diagnostics, I propose
a novel paradigm-shifting approach which incorporates original crowdsourcing algorithms into the ML feature
extraction process to create representation vectors of nuanced social behaviors with sufficient discriminative
power to distinguish related and overlapping neuropsychiatric conditions such as Autism Spectrum Disorder and
Attention-Deficit/Hyperactivity Disorder. Crowdsourcing, or the use of distributed workers to collectively work
towards a larger task, is traditionally used to label training data for ML and is increasingly leveraged as a tool to
run public health studies. However, crowdsourcing has yet to be thoroughly explored as a central tool in precision
diagnostics for psychiatry. In the proposed paradigm, each crowd worker will answer targeted multiple choice
questions about each video, reducing the feature space into a socially rich feature vector corresponding to the
behaviors displayed in the video. My innovative crowdsourcing framework involves creating a quantified profile
of each crowd worker to dynamically assign them to labeling tasks based on the categories of questions they
rate in accordance with clinical experts. This crowdsourcing pipeline will be tested with respect to 3 major
components of the digital diagnostics pipeline: (1) gamified structured video data curation from each participant,
(2) behavioral feature extraction, and (3) diagnostic prediction with deep learning. The structured data collection
will occur through paired social interactions between participants who remotely interact by playing social games
on the web while their webcam and microphone record their behaviors. Crowd workers will watch the videos and
answer multiple choice questions pertaining to the subject’s behavior in the video. The crowd annotations and
metadata will be supplemented with computationally extracted eye gaze, facial emotion expression, vocal pitch,
and speech timing features. These features will be collectively used to train a deep learning model which outputs
both diagnostic categories and indicators of the presence of individual behavioral characteristics (e.g.,
hyperactivity and distractibility). Based on crowd labels of earlier games,...
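
The worker-profiling idea in the abstract (scoring each crowd worker by how often their answers match clinical experts within each question category, then routing them to the categories where they agree) can be illustrated with a minimal Python sketch. This is not the project's implementation: the function names, the tuple layout, the category names, and the `min_agreement` threshold of 0.8 are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def build_worker_profiles(ratings, expert_labels):
    """Compute per-worker, per-category agreement with expert answers.

    ratings: iterable of (worker_id, question_id, category, answer) tuples
             (hypothetical layout, not taken from the source).
    expert_labels: dict mapping question_id -> expert answer.
    Returns: dict worker_id -> dict category -> agreement rate in [0, 1].
    """
    # counts[worker][category] = [number of agreements, number of rated questions]
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for worker_id, question_id, category, answer in ratings:
        if question_id not in expert_labels:
            continue  # no expert reference for this question, so skip it
        tally = counts[worker_id][category]
        tally[0] += int(answer == expert_labels[question_id])
        tally[1] += 1
    return {
        worker: {cat: agree / total for cat, (agree, total) in cats.items()}
        for worker, cats in counts.items()
    }


def assign_categories(profiles, min_agreement=0.8):
    """Route each worker to the categories where agreement meets the threshold.

    min_agreement is an assumed cutoff, not a value from the source.
    """
    return {
        worker: sorted(cat for cat, rate in cats.items() if rate >= min_agreement)
        for worker, cats in profiles.items()
    }


# Toy data with made-up question categories and answers.
example_ratings = [
    ("w1", "q1", "eye_contact", "often"),
    ("w1", "q2", "eye_contact", "rarely"),
    ("w1", "q3", "speech", "flat"),
    ("w2", "q1", "eye_contact", "rarely"),
    ("w2", "q3", "speech", "flat"),
]
example_expert_labels = {"q1": "often", "q2": "rarely", "q3": "varied"}

example_profiles = build_worker_profiles(example_ratings, example_expert_labels)
example_assignments = assign_categories(example_profiles)
```

In this toy example, worker `w1` matches the expert on both `eye_contact` questions and so would be routed to that category only, while `w2` matches on none and receives no assignments. A real pipeline would presumably need smoothing for workers with few rated questions and a policy for re-evaluating profiles over time; neither is specified in the abstract.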

## Key facts

- **NIH application ID:** 10682965
- **Project number:** 1DP2EB035858-01
- **Recipient organization:** UNIVERSITY OF HAWAII AT MANOA
- **Principal Investigator:** Peter Washington
- **Activity code:** DP2
- **Funding institute:** NIH
- **Fiscal year:** 2023
- **Award amount:** $94,037
- **Award type:** 1
- **Project period:** 2023-09-21 → 2024-12-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10682965

## Citation

> US National Institutes of Health, RePORTER application 10682965, Crowd-Powered Machine Learning to Diagnose ASD and ADHD in Adolescents from Digital Social Interactions (1DP2EB035858-01). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10682965. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
