From “Mirror Flower, Water Moon” to Multi‐Task Visual Prospective Representation Learning for Unmanned Aerial Vehicles Indoor Mapless Navigation

Chang, Yingxiu, Cheng, Yongqiang, Murray, John, Khalid, Muhammad and Manzoor, Umar (2025) From “Mirror Flower, Water Moon” to Multi‐Task Visual Prospective Representation Learning for Unmanned Aerial Vehicles Indoor Mapless Navigation. Journal of Field Robotics. ISSN 1556-4967

Item Type:	Article

Abstract

Vision‐based deep learning models have been widely adopted in autonomous agents, such as unmanned aerial vehicles (UAVs), particularly in reactive control policies that serve as a key component of navigation systems. These policies enable agents to respond instantaneously to dynamic environments without relying on pre‐existing maps. However, two open challenges remain in improving the agent's reactive control performance: (1) Can future states be anticipated at the current moment to improve control precision, and if so, how? (2) Can future states be anticipated for different sub‐tasks when the agent's control consists of both discrete classification and continuous regression commands, and if so, how? Inspired by the Chinese idiom “Mirror Flower, Water Moon,” this paper hypothesizes that future states in the latent space can be learnt from sequential images using contrastive learning, and consequently proposes a light‐weight Multi‐task Visual Prospective Representation Learning (MulVPRL) framework to benefit reactive control. Specifically, the paper (1) uses contrastive learning to correlate the representations obtained from the latest sequential images with that of one future image, and (2) constructs an integrated contrastive loss function for the classification and regression sub‐tasks. The MulVPRL framework outperforms the benchmark models on the public HDIN and DroNet datasets and obtains the best performance in real‐world experiments (46.9 m, 177 s vs. SOTA 27.3 m, 136 s). Therefore, the multi‐task contrastive learning of the light‐weight MulVPRL framework enhances reactive control performance on a 2D plane and shows potential to be integrated with various intelligent strategies and implemented on ground vehicles.
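The abstract does not give the form of the integrated loss, so the following is only a minimal sketch of the general idea, not the paper's implementation. It assumes a standard InfoNCE contrastive objective that pulls the embedding of each image sequence toward the embedding of its own future image, and it assumes the classification and regression sub‐tasks are combined by a simple weighted sum. The function names, the `weight` parameter and the `temperature` value are all hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def info_nce(seq_emb, future_emb, temperature=0.1):
    """InfoNCE loss: row i of seq_emb is the positive pair of row i of future_emb.

    All other rows of future_emb in the batch act as negatives.
    This is a generic contrastive loss, assumed here for illustration.
    """
    a = l2_normalize(seq_emb)
    b = l2_normalize(future_emb)
    logits = a @ b.T / temperature                        # (N, N) similarity matrix
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))             # positives lie on the diagonal


def multi_task_contrastive_loss(cls_seq, cls_future, reg_seq, reg_future,
                                weight=0.5, temperature=0.1):
    """Hypothetical weighted sum of per-sub-task contrastive losses.

    cls_* are embeddings from a classification-aware head,
    reg_* are embeddings from a regression-aware head (both assumed).
    """
    loss_cls = info_nce(cls_seq, cls_future, temperature)
    loss_reg = info_nce(reg_seq, reg_future, temperature)
    return weight * loss_cls + (1.0 - weight) * loss_reg


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    seq = rng.normal(size=(8, 16))                  # stand-in sequence embeddings
    fut = seq + 0.01 * rng.normal(size=(8, 16))     # stand-in future-image embeddings
    print("aligned pairs:   ", info_nce(seq, fut))
    print("misaligned pairs:", info_nce(seq, np.roll(fut, 1, axis=0)))
```

Under these assumptions, correctly paired sequence and future embeddings yield a lower loss than mismatched pairs, which is the property a prospective representation would be trained toward.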

PDF
Journal of Field Robotics - 2025 - Chang - From Mirror Flower Water Moon to Multi‐Task Visual Prospective Representation.pdf - Published Version
Available under License Creative Commons Attribution.

More Information

Uncontrolled Keywords: indoor unknown environment, prospective classification‐aware representation, visual prospective representation learning (VPRL), prospective regression‐aware representation, UAV, mapless navigation, contrastive learning

Related URLs: http://creativecommons.org/licenses/by/4...

SWORD Depositor: Publication Router

Depositing User: Publication Router