Resource Informatics Core

NIH RePORTER · NIH · U41 · $250,965

Abstract

C. Informatics Core Summary

The informatics core will continue to deliver high-quality, well-structured datasets with complete metadata, along with comprehensive data analysis, to the community. To achieve this, we have developed bioinformatics pipelines to process and validate our ChIP-seq and RNA-seq data, and we have worked extensively with the ENCODE DCC to curate our metadata so that our data are easily accessible. The ChIP-seq pipeline has been used to call both narrow and broad peaks and to annotate HOT regions and TF binding sites in worm and fly across varying samples and stages. The RNA-seq pipeline has been used to identify differentially expressed genes under various conditions, such as different developmental stages and TF mutants, and we will evaluate TF binding sites associated with these genes. Although these pipelines have been set up and tested thoroughly, we aim to optimize them further; for instance, we are developing a new method to call ChIP-seq peaks using multiple types of controls. To our knowledge, no such peak caller exists.

To integrate and analyze our data, we will develop a mini-encyclopedia with three levels of annotation, similar to the encyclopedia developed through the ENCODE project. The ground level will consist of the gene expression, TF binding, and histone modification data in worm and fly. Based on our preliminary results, we have developed advanced statistical models to identify functional genomic regions, such as enhancers and HOT regions; we will deposit these results into the middle annotation level. The top level will contain linkages between genes and their regulators, as predicted by our models. The regulators include both cis- and trans-regulatory elements, such as enhancers and TFs. Moreover, the linkages will be integrated to form temporal or spatial networks, and we aim to identify key regulatory factors by comparing the structure of these networks.

We will share all of our datasets, analysis results, and worm and fly strains with the community through the appropriate public databases.

Key facts

NIH application ID: 10136671
Project number: 5U41HG007355-08
Recipient: UNIVERSITY OF WASHINGTON
Principal Investigator: ROBERT H WATERSTON
Activity code: U41
Funding institute: NIH
Fiscal year: 2021
Award amount: $250,965
Award type: 5
Project period: 2013-09-20 → 2024-03-31