Project Summary

Complex diseases and traits arise from dynamic genetic regulation and its interaction with the environment. Numerous genetic, genomic, and phenotypic datasets have been generated, including genotypes, gene expression, epigenetic changes, and electronic medical records (EMRs). A main challenge now is to develop novel informatics approaches that effectively link phenotype with genomic information. Specifically, genome-wide association studies (GWAS) have reported several thousand single nucleotide polymorphisms (SNPs) that are significantly associated with diseases and traits; however, more than 80% of them are noncoding variants, making it difficult to interpret their potential disease-causal roles. We and others have systematically examined how phenotypic variability in disease risk, across a broad spectrum of disease phenotypes, can be explained by regulatory variants. We now hypothesize that such regulation acts in a tissue-specific, cell type-specific, and developmental stage-specific (TCD-specific) manner. Importantly, large genomic consortia such as ENCODE, FANTOM5, Roadmap Epigenomics, and GTEx continue to generate high-quality functional data for annotating genome-wide variants. Emerging single-cell sequencing technologies make it possible to examine how genetic variants affect cellular functions within individual cells or specific cell types. Together, these resources offer an unprecedented opportunity to develop novel statistical and computational approaches for a deeper understanding of the genetic architecture of phenotypes. In this proposal, we combine bioinformatics, single-cell omics, deep learning, and phenotype and EMR data mining to develop novel analytical strategies that maximally leverage genotype and expression information from massive heterogeneous data, aiming to predict phenotype through functional assessment of DNA variation at the TCD-specific level. We propose the following three specific aims.
(1) To develop a deep learning-based variant impact predictor, DeepVIP, that maximally utilizes functional and regulatory data to predict the causal roles of variants in complex diseases and traits.
(2) To develop phenotype-specific network approaches that resolve genotype-phenotype relationships in a spatiotemporal manner and at single-cell resolution. We will develop a novel method, single-cell dense module search of GWAS signals (scGWAS), and a graph neural network approach, GNN-scTP, to detect the driving roles of genes from single-cell RNA-seq data. These methods can effectively identify critical regulatory modules and genes in complex disease in a TCD-specific manner.
(3) To apply these methods to 16 neurodevelopmental and neurodegenerative disorders and related traits, as well as to broad phenotypes, using Vanderbilt biobank (BioVU) and UK Biobank data, both of which link genotypes with rich phenotypic information.
Our proposal is timely and innovative to study the genetic architecture in human complex ...