PROJECT SUMMARY / ABSTRACT
Gene expression patterns are determined by a complex network of cis-regulatory elements and trans-acting factors. RNA-seq quantification of allele-specific expression, together with genotype data from matched individuals, provides an opportunity to understand how this gene regulatory network is wired and how it is modified by genetic variants. So far, such datasets have been analyzed only one gene at a time, ignoring the complex network of many interacting genes, and only with data collected in batch, treating every sample as equally valuable even though the RNA-seq and genome sequence data from a given sample are informative only in specific circumstances. To address these limitations, we propose to combine allele-specific expression quantitative trait locus (eQTL) mapping with a genetical genomics approach to reconstruct gene networks, treating genetic variants as naturally occurring perturbations of allele-specific expression, and to actively guide the data collection process so that it efficiently captures the most informative naturally occurring perturbations. The computational framework we propose to develop is the first to address this problem and will include 1) probabilistic graphical models for representing and learning gene networks perturbed by cis- and trans-acting eQTLs and 2) active sample selection algorithms for determining which samples warrant additional RNA-seq or genotype data and for updating the current network model with the new samples. We will apply our computational techniques to simulated data, mouse intercross data, and eQTLGen Consortium data to reconstruct gene networks perturbed by genetic variants and to compare the performance of active and batch learning strategies. In particular, we will explore the possibility of implementing an active data collection strategy in rodent studies through a collaboration between a computational biologist and a mouse geneticist.
The proposed research will provide biomedical researchers with a general computational framework, together with cost-effective data collection strategies, for unraveling the gene regulatory mechanisms and cis-/trans-acting eQTLs that give rise to diseases.