This project aims to teach computers how to understand the physical properties of the world from images and videos. When humans look at a video, they intuitively understand how light reflects off surfaces and how objects move under force. However, artificial intelligence systems struggle to understand these physical rules. This project supports the development of new computer vision systems that can automatically figure out an object's three-dimensional shape, what it is made of, how it reacts to light, and how it moves, all from standard video recordings. Teaching computers this kind of physical reasoning will have major benefits for society. In healthcare, these tools can help robotic surgical systems safely navigate inside the human body by understanding how tissues stretch and deform. In manufacturing, they can help robots handle delicate or flexible materials. The project also supports educational goals by creating new courses that teach physical understanding using artificial intelligence, providing research experiences for undergraduates, and engaging high school students and individuals from non-computing backgrounds to inspire the next generation of scientists and engineers.

The technical goal of this project is to develop a generalizable machine perception framework that jointly infers three-dimensional object shape, parameters related to various physical properties of the object, and external physical entities from sparse visual inputs