From "Mirror Flower, Water Moon" to Multi-task Visual Prospective Representation Learning for UAV Indoor Mapless Navigation

Chang, Yingxiu, Cheng, Yongqiang, Murray, John, Khalid, Muhammad and Manzoor, Umar (2025) From "Mirror Flower, Water Moon" to Multi-task Visual Prospective Representation Learning for UAV Indoor Mapless Navigation. Journal of Field Robotics. ISSN 1556-4967

Item Type:	Article

Abstract

Vision-based deep learning models have been widely adopted in autonomous agents, such as Unmanned
Aerial Vehicles (UAVs), particularly in reactive control policies that serve as a key component
of navigation systems. These policies enable agents to respond instantaneously to dynamic
environments without relying on pre-existing maps. However, two open challenges remain in improving
the agent’s reactive control performance: (1) Can future states be anticipated at the current
moment to improve control precision, and if so, how? (2) When the agent’s control consists of both
discrete classification and continuous regression commands, can future states be anticipated for
each sub-task, and if so, how? Inspired by the Chinese idiom "Mirror Flower, Water Moon", this
paper hypothesizes that future states in the latent space can be learnt from sequential images using
contrastive learning, and consequently proposes a light-weight Multi-task Visual Prospective Representation
Learning (MulVPRL) framework to improve reactive control. Specifically, (1) this
paper uses contrastive learning to correlate the representations obtained from
the latest sequential images with that of one future image; (2) this paper constructs an integrated
contrastive loss function for the classification and regression sub-tasks. The MulVPRL framework
outperforms the benchmark models on the public HDIN and DroNet datasets, and achieves the
best performance in real-world experiments (46.9, 177 vs. SOTA 27.3, 136). Therefore, the
multi-task contrastive learning of the light-weight MulVPRL framework enhances reactive control
performance on a 2D plane, and shows potential for integration with various intelligent
strategies and for deployment on ground vehicles.
Keywords: UAV, Indoor Unknown Environment, Mapless Navigation, Contrastive Learning, Visual
Prospective Representation Learning (VPRL), Prospective Regression-aware Representation,
Prospective Classification-aware Representation
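
The contrastive idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an InfoNCE-style loss in which the embedding of the latest image sequence (`present[i]`) and the embedding of its future image (`future[i]`) form a positive pair, and it assumes the integrated loss is a weighted sum of a classification-aware and a regression-aware term. The function names, the `temperature` value, and the `weight` parameter are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(present, future, temperature=0.1):
    """Mean InfoNCE loss; present[i] and future[i] are the positive pair.

    Every other future[j] in the batch acts as a negative (an assumption).
    """
    total = 0.0
    for i, p in enumerate(present):
        logits = [cosine(p, f) / temperature for f in future]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denominator - logits[i]
    return total / len(present)


def integrated_loss(cls_present, cls_future, reg_present, reg_future, weight=0.5):
    """Hypothetical weighted sum of the two sub-task contrastive terms."""
    cls_term = info_nce(cls_present, cls_future)
    reg_term = info_nce(reg_present, reg_future)
    return weight * cls_term + (1.0 - weight) * reg_term


# Toy embeddings: the loss is small when each present embedding points
# toward its own future embedding, and large when the pairs are swapped.
present = [[1.0, 0.0], [0.0, 1.0]]
aligned_future = [[0.9, 0.1], [0.1, 0.9]]
swapped_future = [[0.1, 0.9], [0.9, 0.1]]
aligned = info_nce(present, aligned_future)
swapped = info_nce(present, swapped_future)
```

Under these assumptions, minimizing the loss pulls the representation of the current image sequence toward that of its own future frame and away from the other future frames in the batch.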

PDF
Journal of Field Robotics - 2025 - Chang - From Mirror Flower Water Moon to Multi‐Task Visual Prospective Representation.pdf - Published Version
Available under License Creative Commons Attribution.