Understanding and interpreting complex environments is crucial for autonomous systems to operate safely and efficiently. A self-driving vehicle must navigate uneven terrain, a search-and-rescue drone must identify obstacles in disaster-stricken areas, and an environmental monitoring system must accurately reconstruct large-scale off-road scenes. However, existing computer vision algorithms, designed primarily for structured indoor or urban environments, often fail in these scenarios because off-road terrain is unpredictable, environmental conditions change dynamically, and reliable visual features are scarce. This project will develop a 3D computer vision framework that fuses multiple sensing modalities, including RGB cameras, depth sensors, LiDAR, and event cameras, to enhance feature extraction, tracking, and large-scale scene reconstruction, thereby improving perception accuracy and adaptability in unstructured environments. The research will provide a foundation for next-generation autonomous perception systems, supporting advances in autonomous navigation, environmental monitoring, and search-and-rescue operations. Additionally, the project will provide educational opportunities by engaging students in hands-on research and promoting interdisciplinary learning in STEM.

This project will introduce a framework for learning robust 3D visual representations in unstructured environments by integrating multi-modal sensing, feature extracti