Suicidal behavior, which includes both suicide attempts and death by suicide, is a growing public health concern. Approximately 17 Veterans die by suicide every day (1), with rates highest among younger Veterans (ages 18-34). Although relatively little is known about its biological basis, epidemiological studies make clear that suicidality has a substantial heritable component, with heritability estimates of 30-50% (2). There is evidence that this heritability is moderated in part by a liability to psychiatric disorders, such as mood disorders, but other evidence points to heritable factors independent of psychiatric disorders (3). Rates of suicidal behavior are particularly high in Veterans with a Serious Mental Illness (SMI) when assessed using the Columbia Suicide Severity Rating Scale (CSSRS) (4). In the Cooperative Studies Program (CSP) #572 study, Veterans with bipolar disorder (BPI) have a suicide attempt rate of 55%, with women Veterans having the highest rate (5). By comparison, the suicide attempt rate in the Million Veteran Program (MVP) as a whole is 3.4% (6). Recently, the International Suicide Genetics Consortium (ISGC) identified a genome-wide association signal on chromosome 7, which was independently replicated by MVP investigators (7). We now propose a machine learning analysis of the attempted suicide phenotype in BPI using the CSP#572 and MVP datasets. As with all machine learning studies based on electronic health record (EHR) data, one critical issue is that EHR information on BPI diagnosis and suicidal behavior will be incomplete. To mitigate this and make our machine learning algorithm more effective, we plan to leverage the information gained through CSP#572 (5).
Importantly, the ~5400 CSP#572 Veterans with BPI were genotyped in parallel with the MVP dataset (5,8) and thoroughly phenotyped using the Structured Clinical Interview for DSM Disorders (SCID; 9) and the gold-standard Columbia Suicide Severity Rating Scale (CSSRS; 4), generating a rich collection of relevant phenotypic data. This is a clear advantage for the machine learning project because it gives us more reliable information on the presence or absence of bipolar disorder and suicidal behavior on which to base our algorithm. In Aim 1, we propose an analysis utilizing demographic variables and clinical comorbidities to screen BPI subjects from CSP#572 and a matched set of MVP controls for phenotypes relevant to suicidal behavior. Next, we will employ the Polygenic Risk Score (PRS) approach, which can be used to identify individuals with increased genetic loading for various diseases. Finally, we propose to generate predictive models of suicidal behavior using machine learning-based approaches and the Aim 1 sample set. We will then test these results using an independent MVP cohort and in the Uta...