Principal Investigator (Wright, Erik S.) Abstract: The rapidly increasing number of microbial genomes has revealed an enormous diversity of proteins without any known function. We still know relatively little about more than half of the proteins encoded in a typical bacterial genome, and traditional laboratory techniques are too time-consuming to characterize even a small fraction of the observable universe of proteins. This neglected "dark proteome" potentially contains many important determinants of virulence, antibiotic resistance, and disease. Casting light on the dark proteome is possible through comparative genomics because proteins that interact across evolutionary timescales leave behind a signature of coevolution, which can be used to connect proteins of unknown function with proteins of known function. This "guilt-by-association" analysis helps to generate hypotheses about the cellular roles of unexplored proteins. Here, we develop novel methods for quantifying coevolutionary signals, and we apply these methods to an unprecedentedly large collection of genomes spanning the microbial tree of life. This project will result in a network of coevolving genes that we will make publicly accessible as a web tool for biomedical research. To further harness the power of comparative genomics, we will develop and deploy web applications that provide deeper insights into the universe of microbial proteins.