Semantics-Driven Active Perception and Navigation with Aerial Robots

Liu, Xu

Files

Liu_upenngdas_0175C_16765.pdf (84.02 MB)

Degree type

Doctor of Philosophy (PhD)

Graduate group

Mechanical Engineering and Applied Mechanics

Discipline

Electrical Engineering

Subject

Active mapping
Active metric-semantic mapping
Autonomous navigation
Metric-semantic SLAM
Multi-robot systems
UAV

Copyright date

2024

Permalink

https://repository.upenn.edu/handle/20.500.14332/60891

Author

Liu, Xu

Abstract

Autonomous aerial robots today are capable of safely navigating through cluttered, GPS-denied environments while constructing an accurate map that captures geometric features such as points, lines, and planes. Such maps are crucial for low-level planning and obstacle avoidance. However, beyond offering details on the density, layout, and dimensions of the environment, these maps provide limited information for semantically meaningful reasoning. For instance, they fall short in helping the robot identify where to find specific objects during search and rescue operations, which areas are relevant during infrastructure inspection or asset mapping, or in distinguishing between static and dynamic entities in the environment during localization and navigation. Such reasoning capabilities are especially important for resource-constrained robots, such as micro aerial vehicles, deployed in large-scale environments. In such scenarios, robots need to keep track of important, actionable information, while respecting their onboard computational and storage resource constraints. Sparse, semantically meaningful maps facilitate robust state estimation and storage-efficient mapping, while also enabling high-level reasoning that guides intelligent, task-relevant decision-making during navigation and exploration. This thesis introduces a set of novel methodologies and algorithms for semantics-driven perception and autonomy, enabling robots to safely navigate and explore large-scale, complex environments while actively and collaboratively constructing high-quality, semantically meaningful maps. Specifically, we first present a monocular-camera-based semantic mapping system that integrates deep learning, visual tracking, and semantic Structure from Motion (SfM) for accurate fruit detection and mapping in orchards. 
We then develop an autonomous aerial navigation system, integrated with semantic Simultaneous Localization and Mapping (SLAM), for large-scale semantic mapping in under-forest-canopy environments. This system leverages real-time semantic SLAM for accurate pose estimation and timber metric assessment. Next, we introduce active metric-semantic SLAM systems for both urban outdoor and indoor environments. These systems use semantics to guide aerial robots as they explore the environment and minimize both metric and semantic uncertainties. Lastly, bringing these efforts together, we propose a decentralized metric-semantic SLAM framework for autonomous navigation and exploration with heterogeneous robot teams operating across multiple environments. Extensive real-world experiments validate the robustness and performance of the proposed methods across multiple aerial and ground robot platforms in various environments, including multi-floor indoor spaces, urban outdoors, forests, and orchards. Field deployments further showcase the system's potential for direct application to important real-world problems, such as precision agriculture, forestry management, climate change mitigation, infrastructure inspection, and factory asset management. Finally, the proposed systems and algorithms are made available as open-source tools for public use.

Advisor

Kumar, Vijay

Date of degree

2024

Collection

Dissertations and Theses