CAREER: Foundational 4D Human Video Understanding

NSF Award Search · 01002526DB NSF RESEARCH & RELATED ACTIVIT · $599,955

Abstract

Understanding human behavior from video is a challenging and transformative area of research, with applications in robotics, assistive technologies, neuroscience, and beyond. Humans do not act in isolation; their actions are shaped by their surroundings, interactions with others, and the objects they use. This project aims to develop a new foundational paradigm for understanding humans in 4D (their 3D state over time) from any type of video. Unlike current methods, this approach integrates people with their physical and social context, enabling a deeper understanding of human activities. By creating a computational framework that can analyze both exo-centric (third-person) and ego-centric (first-person) videos, the project addresses the limitations of existing methods and supports downstream applications such as assistive technologies, wearable AI, and data analysis for neuroscience and practical everyday tasks. The resulting advancements will enable robots to learn from observing humans, assistive technologies to better support users, and wearable devices to provide richer context for human activity, contributing to safer, more effective, and accessible technologies with far-reaching impacts across science, industry, and society.

This project will design a scalable, transformer-based model to capture the 4D state of humans and their situational context, including surrounding environments, social interactions, and object use. By leveraging recent advancements in 3D pose

Key facts

NSF award ID: 2442491
Awardee: University of California-Berkeley (CA)
SAM.gov UEI: GS3YEVSS12N6
PI: Angjoo Kanazawa
Primary program: 01002526DB NSF RESEARCH & RELATED ACTIVIT
All programs: CAREER-Faculty Erly Career Dev, ROBUST INTELLIGENCE
Estimated total: $599,955
Funds obligated: $341,618
Transaction type: Continuing Grant
Period: 06/15/2025 → 05/31/2030