Project Summary

Transposable elements (TEs) comprise roughly half of the human genome, and some TE subfamilies, for example among LINE and SINE elements, contain hundreds of thousands of copies. TE sequences are highly enriched for transcription factor binding sites, which gives TEs substantial regulatory potential in the host genome. Mounting evidence suggests that some TEs have escaped epigenetic silencing and are actively involved in multiple biological processes of the host. TEs are significant contributors to the origin of vertebrate long non-coding RNAs, and some TEs also act as promoters in early development and in some terminally differentiated tissues. Our recent study found that domesticated rodent-specific TEs can act as promoters that initiate tissue-specific transcription of more than 300 genes during mouse tissue differentiation. However, how domesticated TE-derived promoters in the human genome regulate gene transcription in distinct tissues and cell types is not clearly characterized. For example, we do not know how many genes are transcribed from TE-derived promoters in particular human tissues; we do not know how the usage of TE-derived promoters differs among cell types within the same tissue; and how domesticated TEs in the human genome created novel tissue-specific expression patterns of conserved protein-coding genes remains unclear. Thus, in this proposed project, we will investigate the tissue- and cell-type-specific gene transcription controlled by domesticated TE-derived promoters in the human genome. Leveraging the large-scale data generated by major consortia, e.g., ENCODE, Roadmap Epigenomics, GTEx, and the Human Cell Atlas, we will perform a systematic survey of the usage of domesticated TE-derived promoters in the human genome.
First, we will identify the TEs that were domesticated as promoters of protein-coding and non-coding genes in the human genome, and characterize the tissue-level expression patterns of domesticated TE-derived transcripts, by applying our established transcript assembly pipeline to the tissue bulk RNA-seq data generated by ENCODE and GTEx. Second, we will investigate the cell-type-specific expression patterns of domesticated TE-derived transcripts by reanalyzing single-cell RNA-seq data with a novel bioinformatics analysis tool. Finally, we will apply comparative-genomics approaches to build an expression matrix of orthologous TE-derived protein-coding genes across multiple species, and construct phylogenetic trees to explore how the expression patterns of TE-derived protein-coding genes changed during evolution.
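
As an illustration of the first step only, the sketch below shows one simple way a TE-derived promoter could be flagged: a transcript is marked as TE-derived when its transcription start site (TSS) falls inside an annotated TE interval. This is an assumption made for illustration, not the project's established pipeline; the record types, the function name te_derived_promoters, the 0-based half-open coordinate convention, and the toy intervals are all hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class TE(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (assumed convention)
    end: int    # 0-based, exclusive
    name: str


class Transcript(NamedTuple):
    tx_id: str
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    strand: str  # "+" or "-"


def tss(tx: Transcript) -> int:
    """Return the TSS position: the 5' end of the transcript on its strand."""
    return tx.start if tx.strand == "+" else tx.end - 1


def te_derived_promoters(transcripts, tes):
    """Map transcript ID -> name of the first TE whose interval contains the TSS."""
    # Group TE intervals by chromosome so each transcript only scans its own chromosome.
    by_chrom = defaultdict(list)
    for te in tes:
        by_chrom[te.chrom].append(te)

    hits = {}
    for tx in transcripts:
        pos = tss(tx)
        for te in by_chrom.get(tx.chrom, []):
            if te.start <= pos < te.end:
                hits[tx.tx_id] = te.name
                break
    return hits


# Toy, made-up annotations used only to exercise the function.
toy_tes = [TE("chr1", 100, 400, "L1_toy"), TE("chr1", 900, 1200, "Alu_toy")]
toy_transcripts = [
    Transcript("tx1", "chr1", 150, 2000, "+"),  # TSS at 150, inside L1_toy
    Transcript("tx2", "chr1", 500, 1000, "-"),  # TSS at 999, inside Alu_toy
    Transcript("tx3", "chr1", 500, 800, "+"),   # TSS at 500, outside any TE
    Transcript("tx4", "chr2", 150, 900, "+"),   # no TEs annotated on chr2
]
toy_hits = te_derived_promoters(toy_transcripts, toy_tes)
print(toy_hits)
```

A linear scan per chromosome is adequate for a toy example; with genome-scale TE annotations an interval index or a dedicated interval-overlap tool would be the more practical choice.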