# A Data Science Framework for Empirically Evaluating and Deriving Reproducible and Transferrable RDoC Constructs in Youth

> **NIH R01** · NEW YORK STATE PSYCHIATRIC INSTITUTE DBA RESEARCH FOUNDATION FOR MENTAL HYGIENE, INC · 2022 · $660,324

## Abstract

This project provides a data science framework and a toolbox of best practices for systematic
and reproducible data-driven methods for validating and deriving RDoC constructs with
relevance to psychopathology. Despite recent advances in methods for data-driven constructs,
results are often hard to reproduce using samples from other studies. There is a lack of
systematic statistical methods and analytical design for enhancing reproducibility. To fill this
gap, we will develop a data science framework, including novel scalable algorithms and
software, to derive and validate RDoC constructs. Although the proposed methods will
generally apply to all RDoC domains and constructs, we focus specifically on furthering
understanding of the RDoC domains of cognitive control (CC) and attention (ATT) constructs
implicated in attention-deficit/hyperactivity disorder (ADHD) and obsessive-compulsive disorder (OCD). Our
application will use multi-modal neuroimaging, behavioral, and clinical/self-report data from
large, nationally representative samples from the Adolescent Brain Cognitive Development
(ABCD) study and multiple local clinical samples with ADHD and OCD. Specifically, in Aim 1,
using the baseline ABCD samples, we will apply and develop methods to assess and validate the
current configuration of RDoC for CC and ATT using confirmatory latent variable modeling. We
will implement and develop new unsupervised learning methods to construct new
computationally driven, brain-based domains from multi-modal imaging data. In Aim 2, we will
introduce network analysis (via Gaussian graphical models) to characterize heterogeneity in the
interrelationship of RDoC measurements due to observed characteristics (i.e., age and sex). We
will further model the heterogeneity of the population due to unobserved characteristics by
introducing data-driven precision phenotypes, which are subgroups of participants with
similar RDoC dimensions. We propose a hierarchical Bayesian generative model and a scalable
algorithm for simultaneous dimension reduction and identification of precision phenotypes. The model
also serves as a tool to transfer information from the ABCD community sample to local, clinically
enriched studies. In Aim 3, we will use the follow-up samples from ABCD and the local, clinically
enriched data sets to validate the results from Aims 1 and 2 and assess the clinical utility of the
precision phenotypes in predicting psychological development at follow-up. Our project
will provide a suite of analytical tools to validate existing RDoC constructs and derive new,
reproducible constructs by accounting for various sources of heterogeneity.
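
As a rough illustration of the network analysis mentioned for Aim 2, the sketch below estimates a Gaussian graphical model by converting an inverse sample covariance matrix into partial correlations. It is not the project's method or software: the data are synthetic, the three variables are placeholders rather than RDoC measurements, and the helper names `partial_correlations` and `simulate_chain` are hypothetical. It uses only NumPy and assumes far more participants than variables, so the sample covariance can be inverted without regularization.

```python
import numpy as np


def partial_correlations(X):
    """Return the partial correlation matrix of the columns of X.

    Assumes X has shape (n_samples, n_variables) with n_samples much
    larger than n_variables, so the sample covariance is invertible.
    """
    cov = np.cov(X, rowvar=False)
    precision = np.linalg.inv(cov)
    scale = np.sqrt(np.diag(precision))
    # Off-diagonal partial correlation: -omega_ij / sqrt(omega_ii * omega_jj)
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor


def simulate_chain(n, rng):
    """Synthetic chain a -> b -> c (hypothetical placeholder variables)."""
    a = rng.normal(size=n)
    b = 0.8 * a + rng.normal(size=n)
    c = 0.8 * b + rng.normal(size=n)
    return np.column_stack([a, b, c])


rng = np.random.default_rng(0)
X = simulate_chain(5000, rng)

marginal = np.corrcoef(X, rowvar=False)
pcor = partial_correlations(X)

# a and c are marginally correlated, but their partial correlation is
# close to zero once b is conditioned on, so the graph has no a-c edge.
print(np.round(marginal, 2))
print(np.round(pcor, 2))
```

In a Gaussian graphical model, a near-zero partial correlation corresponds to a missing edge between two measurements. One simple way to probe heterogeneity due to observed characteristics would be to apply the same function separately within age or sex strata and compare the resulting networks; with many variables, a regularized estimator such as the graphical lasso is the more usual choice.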

## Key facts

- **NIH application ID:** 10441499
- **Project number:** 5R01MH124106-03
- **Recipient organization:** NEW YORK STATE PSYCHIATRIC INSTITUTE DBA RESEARCH FOUNDATION FOR MENTAL HYGIENE, INC
- **Principal Investigator:** SEONJOO LEE
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $660,324
- **Award type:** 5
- **Project period:** 2020-09-01 → 2025-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10441499

## Citation

> US National Institutes of Health, RePORTER application 10441499, A Data Science Framework for Empirically Evaluating and Deriving Reproducible and Transferrable RDoC Constructs in Youth (5R01MH124106-03). Retrieved via AI Analytics 2026-05-25 from https://api.ai-analytics.org/grant/nih/10441499. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
