IMP: Software for Hybrid Determination of Macromolecular Assembly Structures

NIH RePORTER · NIH · R01 · $342,380

Abstract

Project Summary

The broad goal is to develop and apply computational methods for building structural models of proteins and their assemblies. One successful approach, integrative structure modeling, casts the building of such models as a computational optimization problem where knowledge about the assembly is encoded into the scoring function used to evaluate candidate models. We propose to extend and enhance the Integrative Modeling Platform (IMP) program that provides programmatic support for developing and distributing integrative structure modeling protocols. IMP already allows representing molecules at multiple resolutions, using spatial restraints from many types of data, and searching for solutions by a variety of sampling algorithms. So far, it has been applied mostly to data from electron microscopy (EM), mass spectrometry, small-angle X-ray scattering, Förster resonance energy transfer, crosslinking, hydrogen-deuterium exchange (HDX), and various proteomics methods. IMP is easily extensible to add support for new data sources and algorithms, and is distributed under an open source license. Here, we propose to extend IMP to address a greater range of biological problems and make it more generally useful to the scientific community.

Specifically, in Aim 1, we will develop integrative threading for computing an atomic model based on a density map determined at medium resolution (4-8 Å) by EM or X-ray crystallography. This goal will be achieved by simultaneously sampling both threading and conformation based on the density map as well as other data, such as chemical cross-links and HDX protection factors. This method is significant because it will produce atomic-resolution models from medium-resolution maps determined by either EM or X-ray crystallography.

In Aim 2, we will develop a Bayesian integrative method for modeling ensembles of similar systems. Data from different samples, ensembles, and/or variants are often pooled together to model a single representative structure. This synthesis is problematic when the variation between the actual structures across the samples, ensembles, and variants is larger than the uncertainty of the data. We will address the challenge by developing a general and flexible scheme for representing and scoring related structural ensembles. This method is significant because it will improve the accuracy of the model and the estimate of its uncertainty.

In Aim 3, we will maximize the impact of IMP on the community by delivering a well-tested and maintained software package that is documented with mailing lists, examples, and demonstrations at local and external workshops, by hosting select users at UCSF, and by pursuing closer integration with other software packages and community resources, including databases such as the Protein Data Bank (PDB), structure viewers such as Chimera and VMD, and other modeling programs such as NAMD and ReaDDy. The proposed aims are informed by and will shape the nascent w...

Key facts

NIH application ID: 10693199
Project number: 5R01GM083960-15
Recipient: UNIVERSITY OF CALIFORNIA, SAN FRANCISCO
Principal Investigator: ANDREJ SALI
Activity code: R01
Funding institute: NIH
Fiscal year: 2023
Award amount: $342,380
Award type: 5
Project period: 2008-04-01 → 2025-08-31