For the past 18 years, CASAH has followed 340 predominantly Black and Latino/a youth with perinatally acquired HIV (PHIV) and youth who were perinatally HIV-exposed but uninfected (PHEU) – enrolled at ages 9-16 years from vulnerable communities in New York City – documenting health risk and resilience across childhood, adolescence, and emerging adulthood. Now in its fourth competing continuation (R01 MH69133-19), CASAH4, as it is known, is following this cohort through young adulthood (YA; 20s-early 30s). CASAH has made significant contributions to research on health risk and resilience, and on the social determinants of these outcomes, in PHIV and PHEU youth, with over 125 publications and 80 scientific presentations, and has directly informed mental health and HIV prevention interventions and service systems in the US and abroad. CASAH is guided by Social Action Theory (SAT), and CASAH4 specifically examines: 1) the impact of HIV infection on behavioral health outcomes (e.g., mental health, sexual risk, substance use, adherence) and achievement of adult milestones (e.g., education, vocation, independence); 2) how SAT-informed risk and protective factors affect YA behavioral health and achievement of adult milestones; 3) trajectories of behavioral health across adolescence and young adulthood and SAT-informed predictors of these trajectories; and 4) behavioral health outcomes and their SAT-informed predictors among youth across global cohorts (e.g., US, Thailand, South Africa). CASAH4 has added biomedical health indicators (inflammation and immune activation biomarkers associated with psychiatric and neurocognitive function) to the HIV RNA viral load and CD4+ cell count data already collected.
With its ongoing and historical work, CASAH has one of the most comprehensive longitudinal datasets on mental health, alcohol/substance use, behavioral health, and milestone achievement among adolescents and young adults living with PHIV or PHEU – a dataset ideal for machine learning (ML) and for sharing with other multidisciplinary scientists. However, gaps in the full CASAH dataset (e.g., unprocessed data, unscaled variables) and in data documentation are barriers to using it for ML and to sharing it via NIH-supported data repositories. Thus, the aims of this one-year ML data readiness administrative supplement are to prepare (collate, clean, document, test, and share) the full CASAH dataset (which will ultimately span 20 years and 10 waves of data collection) for use in ML applications and for sharing with a range of multidisciplinary scientists, in accordance with NIH goals to advance our knowledge of health outcomes and social determinants of health in vulnerable young people affected by HIV. In this proposal, we add expertise in ML data analytic approaches, data security, and data transportability. Ensuring the transportability of CASAH data will allow for cross-cohort studies with non-HIV populations (e.g., The National Longitudinal Study of Adolescent to Adult Health; ...