Project Summary

Cryogenic electron microscopy (cryo-EM) has emerged as a major experimental technology for determining protein structures, having reached atomic resolution (1.2-4 Å) in recent years. Compared to traditional techniques (i.e., X-ray crystallography and nuclear magnetic resonance), cryo-EM has the unique capability of determining the quaternary structures of large protein complexes and assemblies that are difficult or impossible for those techniques to handle. The advance of cryo-EM technology has stimulated a revolution in structural biology, enabling the study of large protein complexes and assemblies that could not be studied well before. However, the computational reconstruction of protein structures from cryo-EM image data is still a time-consuming, labor-intensive, error-prone, and often inaccurate process. This is due to the bottleneck in picking protein particles in cryo-EM images, the substantial noise in 3D cryo-EM density maps generated from particle images, and the lack of automated and accurate methods to build protein structures from density maps. We plan to develop advanced deep learning methods to reconstruct protein structures automatically and accurately from cryo-EM image data, leveraging the large amount of high-resolution cryo-EM data accumulated in the field and the latest advances in deep learning technology. We will develop 2D transformer networks, which are built on the attention mechanism and perform better than traditional convolutional and recurrent neural networks in image processing, to pick single protein particles accurately and automatically in cryo-EM image data via a novel combination of unsupervised and supervised learning. Moreover, we formulate the denoising of 3D cryo-EM density maps generated from 2D particle images as a novel machine learning problem and will develop both 3D deep autoencoders and rotation-/translation-equivariant transformer networks to remove noise from cryo-EM density maps.
Furthermore, we will develop end-to-end 3D rotation-/translation-equivariant networks to directly identify the backbone atoms of proteins from 3D density maps without using any known structure as a template. The identified backbone atoms will then be used by a novel hidden Markov model to build high-resolution full-atom structures of any protein. The methods will be rigorously evaluated on a large amount of cryo-EM data and compared with existing methods. All these methods will be integrated into a fully automated machine learning pipeline, the first of its kind in the field, to reconstruct protein structures from cryo-EM images more accurately than existing methods. We will implement the individual deep learning methods as well as the entire pipeline as open-source packages released on GitHub for the community to use. We will further validate the tools and pipeline by applying them to new cryo-EM data of a group of important membrane protein complexes (i.e., ion channels) to be generated at Brookhaven National Laboratory.