Summary (NOT-OD-21-094 Administrative Supplement to P50DA037844)
In this supplement application, we seek funds to improve the AI/ML-readiness of data generated by the Center for GWAS in outbred rats. Since the center's formation in 2014, we have collected extensive data on more than 8,000 genetically unique heterogeneous stock (HS) rats and have secured funding to grow that number to 16,000 by 2025. Data types include genotypes at millions of single nucleotide polymorphisms (SNPs), complex behavioral and physiological phenotypes, RNASeq, ATACSeq, single-cell RNASeq, single-cell ATACSeq, microbiome, and metabolomic data. While the center is focused on traits relevant to substance abuse, these datasets are much more broadly applicable: they include other behavioral traits relevant to all fields of neuroscience and physiological traits relevant to numerous organ systems and diseases. These data have been carefully curated through numerous human and automated quality control steps, and are organized by the data types available for each unique individual. However, there is no public-facing description of the data, and no effort has been put into making them AI/ML-ready. In this proposal, we will improve this situation by bringing together a team with expertise in 1) this specific dataset, 2) best practices for information sharing, and 3) AI/ML for genetic applications. We will begin by bringing the group together to identify the most important and addressable shortcomings. We will then address these shortcomings, meeting frequently to monitor progress and overcome unanticipated challenges. Finally, as the work is completed, our extant network of AI/ML collaborators will perform simple AI/ML exercises to confirm that the improvements are successful. This will be an iterative process: we may revise specific action items over the course of the project to maximize impact.
We anticipate that improvements will include establishing a website and making all of our data findable. We will use protocols.io to document each research protocol, will assign RRIDs to all individuals, and will use best practices to make all data FAIR and AI/ML-ready. This supplement will provide the impetus and funding to bring together an outstanding team to ensure that NIH's investment in this unique dataset can be used for cutting-edge AI/ML approaches. This project is within the scope of the parent award but does not duplicate any work already supported by the parent grant.