Enhancing Autonomous Vehicle Perception: A Focus on Embedding Learning and Uncertainty Analysis

Cai, Kaiwen (2024) Enhancing Autonomous Vehicle Perception: A Focus on Embedding Learning and Uncertainty Analysis. Doctor of Philosophy thesis, University of Liverpool.

201506247_Mar2024.pdf - Author Accepted Manuscript

Abstract

Perception plays a critical role in autonomous driving, encompassing key areas such as place recognition, semantic segmentation, and object detection, all of which have been extensively researched. Embedding learning underpins these recognition tasks: it learns a mapping function that places similar objects close together in an embedding space. However, embedding learning is primarily based on Deep Neural Networks (DNNs), which makes it difficult to establish trust in the learnt embeddings, especially in a risk-sensitive setting such as autonomous driving. This thesis focuses on enhancing the reliability and reducing the risks associated with embedding learning techniques in autonomous driving.

We first focus on improving place recognition, the task of identifying previously visited places. Existing place recognition methods are susceptible to unfavorable environmental conditions because they rely on visual cameras. To tackle this limitation, we leverage a new millimeter-wave sensor and develop several novel modules to make the most of it. These modules include eliminating objects based on radial velocities, spatial-temporal encoding of radar characteristics, and dynamic re-ranking of candidates through radar cross section measurements. Together, the proposed modules increase the accuracy and reliability of embedding learning in autonomous driving scenarios.

Beyond hardware considerations, we also address risk from the model side. The second part of this thesis estimates uncertainty in a common RGB-image-based place recognition task. To accomplish this, we introduce a teacher-student framework for embedding learning, in which the teacher and student networks share the same architecture and training data. The student network learns to generate well-calibrated uncertainty while maintaining predictive performance. The estimated uncertainty helps to identify unreliable predictions and thus reduces the risk of embedding learning.

In the third part, we shift our focus from instance retrieval to object retrieval. We demonstrate the significance of point-level embedding learning in two dense prediction tasks: 3D semantic segmentation and geometric feature learning. We observe that dense prediction is generally built on an embedding learning network. We therefore propose the use of cross-point embeddings for uncertainty estimation in 3D dense prediction. By establishing a probabilistic embedding model and enforcing metric alignments in the embedding space, we obtain well-calibrated uncertainty estimates in these tasks.

In the last part, rather than estimating uncertainty for a prediction, we produce predictions based on the estimated uncertainty. We present Risk Controlled Image Retrieval (RCIR), which generates retrieval sets that are guaranteed to contain the true nearest neighbor with a user-specified probability. This is motivated by the observation that, although uncertainty quantification can assess uncertainty for query and database images, it provides only a heuristic estimate rather than a guarantee. RCIR can be easily plugged into any uncertainty-aware image retrieval pipeline and is agnostic to the data distribution and the choice of model.

In conclusion, this thesis presents a suite of solutions designed to bolster reliability and reduce risks in real-world embedding learning applications within autonomous driving. These applications span place recognition, 3D semantic segmentation, and geometric feature learning. The proposed methods have been evaluated across multiple datasets, where they outperform existing techniques.
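The coverage guarantee described for RCIR can be illustrated with a generic split-conformal calibration of the retrieval-set size. The sketch below is not the thesis's algorithm: it is a minimal illustration that assumes a held-out calibration set of queries with known true nearest neighbors, and it assumes calibration and test queries are exchangeable. All function names (`rank_of_true_neighbor`, `calibrate_set_size`, `retrieve_set`) are hypothetical, and Euclidean distance between embeddings is an assumption.

```python
import math


def ranked_indices(query, database):
    """Database indices sorted by Euclidean distance to the query embedding."""
    dists = [math.dist(query, item) for item in database]
    return sorted(range(len(database)), key=lambda i: dists[i])


def rank_of_true_neighbor(query, database, true_index):
    """1-based position of the true nearest neighbor in the ranked list."""
    return ranked_indices(query, database).index(true_index) + 1


def calibrate_set_size(calibration_ranks, alpha):
    """Pick a set size k so that top-k sets miss the true neighbor with
    probability at most alpha (marginally, under exchangeability).

    Uses the standard split-conformal quantile: the ceil((n + 1) * (1 - alpha))-th
    smallest calibration rank. Returns None when the calibration set is too
    small to support the requested level; the caller should then fall back
    to returning the whole database.
    """
    n = len(calibration_ranks)
    position = math.ceil((n + 1) * (1 - alpha))  # 1-based order statistic
    if position > n:
        return None
    return sorted(calibration_ranks)[position - 1]


def retrieve_set(query, database, k):
    """Return the indices of the k closest database embeddings."""
    return ranked_indices(query, database)[:k]
```

In use, `rank_of_true_neighbor` would be evaluated once per calibration query, the resulting ranks passed to `calibrate_set_size` together with the tolerated miss rate `alpha`, and the returned `k` then applied to every test query through `retrieve_set`. This fixed-size variant ignores per-query uncertainty, which the abstract suggests RCIR draws on; it is meant only to show how a user-specified probability can be turned into a retrieval-set size.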

Item Type:	Thesis (Doctor of Philosophy)
Divisions:	Faculty of Science and Engineering > School of Electrical Engineering, Electronics and Computer Science
Depositing User:	Symplectic Admin
Date Deposited:	16 Apr 2024 09:27
Last Modified:	16 Apr 2024 09:28
DOI:	10.17638/03179131
Supervisors:	Huang, Xiaowei; Lu, Xiaoxuan
URI:	https://livrepository.liverpool.ac.uk/id/eprint/3179131