PROJECT SUMMARY/ABSTRACT
Suicide is the second leading cause of death in adolescence, and the adolescent suicide rate has continued to increase. Despite 50 years of research, predicting suicide and suicidal thoughts and behaviors (STBs) remains difficult. Recent studies indicate that electronic health record (EHR) data analytics can help predict the risk of STBs. The Research Domain Criteria (RDoC) provide a framework to probe transdiagnostic domains, including positive valence, negative valence, and the sleep-wakefulness element insomnia within the arousal and regulatory domain. These three RDoC constructs have shown strong associations with depression and anxiety disorders. However, a knowledge gap remains regarding the relationship between RDoC measures extracted from EHR data and youth suicide attempts (SAs), and the impact of these measures on SAs. Given significant changes in positive affect and cognitive systems during childhood and adolescence, our overall goal is to assess RDoC positive valence, negative valence, and the sleep-wakefulness element insomnia in youth with STBs using machine learning (ML) and deep learning based natural language processing (NLP) of EHR data. This study will leverage the effort and resources invested in previous projects at two sites: a study on SA prediction using NLP and ML from EHR data (n=7,670 youths) in the University of Pittsburgh Medical Center (UPMC) hospitals; and the data collection for an SA study at the Children’s Hospital of Philadelphia (CHOP) with 567,091 youths (n=3,125 attempters).
The specific aims are to 1) develop and validate extraction of summary variables from EHR data using deep neural network language models for positive valence, negative valence, and the sleep-wakefulness element insomnia within the arousal and regulatory domain; 2) compare the performance of the ML models developed in Aim 1 for extracting RDoC positive and negative valence and insomnia from EHR data with that of traditional NLP approaches; and 3) test the utility of RDoC positive and negative valence and insomnia in predicting suicidal behaviors. This proposed study, if successful, will be a first step towards studying other RDoC domains and constructs extracted from EHR data in relation to youth SAs and towards translating the resulting models to clinical settings. Dr. Tsui (CHOP) and Dr. Ryan (UPMC) have a long history of collaboration in mental health studies using ML and NLP and have strong experience serving as PIs in various studies. Overall, our study has the potential to advance the field of SA prediction, facilitating timely intervention and ultimately reducing youth suicides.