# RI: Small: Empowering Longer Video Understanding via Token Compression, Selection, and Reasoning

> **NSF 01002526DB NSF RESEARCH & RELATED ACTIVIT** · University of Illinois at Urbana-Champaign (IL) · $600,000

## Abstract

This project aims to advance how machines interpret video content by developing new capabilities for analyzing extended video streams, ranging from several minutes to multiple hours, far beyond the short clips that most current systems are designed to handle. As videos continue to dominate digital communication and information sharing, the ability to understand video over extended timescales is becoming increasingly essential. This research will support both live and recorded formats and encompass a broad spectrum of video sources, including footage from wearable, mobile, and fixed cameras. By equipping intelligent systems with the capacity to comprehend complex, time-varying visual information, the project is expected to drive progress in real-world applications such as interactive assistance, autonomous navigation, augmented reality, and content summarization.

The primary technical challenge addressed by this project is the extreme data volume inherent in long video sequences, which can produce millions of representational units -- known as tokens -- when processed by modern vision-language models based on transformer architectures. This exceeds the context length limits of current models and hinders effective reasoning over long time horizons. To overcome these limitations, the project proposes a novel framework centered on token selection and context-aware representation. Instead of encoding entire video streams, the system will prioritize a small, highly 
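
To give a rough sense of the scale problem described above, the following is a minimal illustrative sketch, not the project's method. The sampling rate, tokens per frame, context limit, and relevance scores are all assumed values chosen for illustration, and the helper names `estimate_visual_tokens` and `select_top_k` are hypothetical. The sketch estimates how many tokens a long video might produce, then keeps only the highest-scoring tokens as a simple stand-in for token selection.

```python
import heapq


def estimate_visual_tokens(duration_s, frames_per_second, tokens_per_frame):
    """Rough token count for a uniformly sampled video (hypothetical helper)."""
    return int(duration_s * frames_per_second) * tokens_per_frame


def select_top_k(scores, k):
    """Return indices of the k highest-scoring tokens, restored to temporal order.

    Plain top-k selection is an assumed stand-in here; the award abstract
    does not specify how tokens are scored or selected.
    """
    top = heapq.nlargest(k, range(len(scores)), key=scores.__getitem__)
    return sorted(top)


# Assumed, illustrative numbers (none of these come from the award abstract).
total = estimate_visual_tokens(
    duration_s=2 * 3600,    # a two-hour video
    frames_per_second=1.0,  # assumed sampling rate
    tokens_per_frame=576,   # assumed tokens per frame
)
context_limit = 128_000     # assumed model context length
print(total, total > context_limit)

# Toy relevance scores for six tokens; keep the three most relevant.
scores = [0.1, 0.9, 0.3, 0.8, 0.05, 0.7]
kept = select_top_k(scores, 3)
print(kept)
```

Under these assumptions, even sparse sampling of a two-hour video yields several million tokens, which is why selecting a small subset before reasoning is attractive.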

## Key facts

- **NSF award ID:** 2519216
- **Awardee organization:** University of Illinois at Urbana-Champaign (IL)
- **SAM.gov UEI:** Y8CWNJRCNN91
- **PI:** Yuxiong Wang
- **Primary program:** 01002526DB NSF RESEARCH & RELATED ACTIVIT
- **All programs:** SMALL PROJECT, ROBUST INTELLIGENCE
- **Estimated total:** $600,000
- **Funds obligated:** $600,000
- **Transaction type:** Standard Grant
- **Period:** 09/01/2025 → 08/31/2028

## Primary source

NSF Award Search: https://www.nsf.gov/awardsearch/showAward?AWD_ID=2519216

## Citation

> US National Science Foundation, Award 2519216, RI: Small: Empowering Longer Video Understanding via Token Compression, Selection, and Reasoning. Retrieved via AI Analytics 2026-06-07 from https://api.ai-analytics.org/grant/nsf/2519216. Licensed CC0.

---

*[NSF Awards dataset](/datasets/nsf-awards) · CC0 1.0*
