PROJECT SUMMARY MicroRNAs (miRNAs) play important roles in diverse biological processes, and their dysregulation can lead to human diseases. A critical step in the control of miRNA expression is the processing of hairpin-containing primary miRNA transcripts (pri-miRNAs), catalyzed by protein complexes consisting of the RNase III enzyme DROSHA, its partner DGCR8, and other proteins. Pri-miRNAs possess cis-acting structural and sequence determinants that license them for processing; disrupting these determinants, for example through human single-nucleotide polymorphisms (SNPs), can affect processing. However, the existing cis-regulatory rules cannot adequately explain the processing of all canonical miRNAs, and no method currently exists for sequence-based prediction of pri-miRNA processing efficiency. Likewise, the impact of human genetic variation on pri-miRNA processing is poorly understood. In addition to miRNAs, the pri-miRNA processing machinery is known to cleave some hairpin-containing messenger RNAs and long noncoding RNAs, regulating their abundance and splicing. However, no method can predict which RNAs are processed by DROSHA. This interdisciplinary proposal will develop first-of-their-kind sequence-based methods to quantitatively predict the processing efficiency of mammalian pri-miRNAs. We will also use these tools to predict the influence of human SNPs on pri-miRNA processing and to identify non-miRNA DROSHA substrates. Leveraging the computational and experimental expertise of the two PIs, we will carry out four aims to achieve the overall goal. In the first aim, we will use a combined computational and experimental approach to develop four quantitative models that generate sequence-based predictions of pri-miRNA processing efficiency and that incorporate the influence of hairpin-flanking sequences, which have been largely ignored to date.
In the second aim, we will predict and validate the effects of human single-nucleotide polymorphisms on pri-miRNA processing. In the third aim, we will predict and evaluate non-miRNA DROSHA substrates. In the fourth aim, we will develop a user-friendly online database that gives the community easy access to our predictions. The methods and database generated from this proposal will fill an existing knowledge gap by providing quantitative predictions of pri-miRNA processing, of the impact of SNPs on the regulation of processing, and of non-miRNA transcripts that are processed similarly to pri-miRNAs. These tools and results can fuel further studies by the community, e.g., to study cis- and trans-regulation of pri-miRNA processing, to evaluate the functions of genetic variants around miRNA loci, and to study mechanisms of gene expression control.