PROJECT SUMMARY Transcriptional cis-regulatory elements (CREs), such as enhancers and promoters, play an essential role in all biological processes by controlling the expression of their target genes. Sequence variants in these CREs can perturb target gene expression by altering the binding of transcription factors (TFs). It is now clear that, for most human disorders, a substantial portion of the risk is encoded within these noncoding regulatory variants. However, systematic identification of regulatory variants and their causative transcriptional machinery in human diseases remains challenging. Over the past decade, we have pioneered approaches to these problems and made significant progress in developing machine-learning-based methods to predict CREs (gkm-SVM) and regulatory variants (deltaSVM) from DNA sequence. We recently demonstrated that regulatory variants predicted by deltaSVM contribute significantly to the heritability of human traits and diseases in a tissue- and cell-type-specific manner. Here, we will extend these methodologies to further improve the discovery of regulatory variants in the human genome and explore their contribution to human diseases and traits. Toward this end, we will employ a two-step training approach. We will first train multiple sequence-based models on a compendium of genomic data to predict regulatory variants. We will then train ensemble models that find the optimal combinations of these models for predicting experimentally identified regulatory variants that exhibit allelic imbalance in chromatin accessibility. Uniquely, we will build these models in a cell-type-resolved manner using human kidney single-cell chromatin accessibility data. Next, we will systematically assess these models using a broad range of human traits and diseases from well-powered genome-wide association studies (GWAS).
We will then computationally identify the target genes of these predicted regulatory variants and, using co-localization analyses, prioritize genes based on their contribution to traits and diseases in the relevant tissues and cell types. Lastly, we will experimentally validate these putative regulatory variants with massively parallel reporter assays, and their predicted target genes with multiple CRE deletion experiments using CRISPR-Cas9. As an exemplar, we will focus on kidney traits and use kidney-relevant cell lines for these validation experiments. Our framework will enable us to further improve the discovery of regulatory variation and ultimately help us better understand how gene regulatory mechanisms are perturbed in human diseases and trait variation.