3D Random Occlusion and Multi-layer Projection for Deep Multi-camera Pedestrian Localization



Qiu, Rui, Xu, Ming, Yan, Yuyao, Smith, Jeremy S ORCID: 0000-0002-0212-2365 and Yang, Xi
(2022) 3D Random Occlusion and Multi-layer Projection for Deep Multi-camera Pedestrian Localization.

2207.10895.pdf - Author Accepted Manuscript

Abstract

Although deep-learning-based methods for monocular pedestrian detection have made great progress, they remain vulnerable to heavy occlusion. Multi-view information fusion is a potential solution, but its application is limited by the scarcity of annotated training samples in existing multi-view datasets, which increases the risk of overfitting. To address this problem, a data augmentation method is proposed that randomly generates 3D cylinder occlusions on the ground plane, each of the average size of a pedestrian, and projects them to multiple views to reduce overfitting during training. Moreover, the feature map of each view is projected, using homographies, to multiple parallel planes at different heights, which allows the CNNs to fully utilize the features across the height of each pedestrian when inferring pedestrian locations on the ground plane. The proposed 3DROM method greatly improves performance over state-of-the-art deep-learning-based methods for multi-view pedestrian detection. Code is available at https://github.com/xjtlu-cvlab/3DROM.
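
The two geometric ideas in the abstract can be illustrated with a minimal NumPy sketch. It is not taken from the 3DROM repository. It assumes a calibrated pinhole camera with a 3x4 projection matrix P, world units in metres, and the ground plane at Z = 0. The function names, the toy camera, and the cylinder radius (0.3) and height (1.8) are hypothetical values chosen for illustration only. For a plane at height h, the homography from ground coordinates (X, Y, 1) to pixels is built from the columns of P as [p1, p2, h*p3 + p4]. A random cylinder is sampled on the ground and its outline points are projected with the same P.

```python
import numpy as np

def plane_homography(P, height):
    """3x3 homography mapping (X, Y, 1) on the plane Z = height to image pixels.

    Follows from P @ (X, Y, h, 1) = X*p1 + Y*p2 + (h*p3 + p4).
    """
    P = np.asarray(P, dtype=float)
    return np.column_stack([P[:, 0], P[:, 1], height * P[:, 2] + P[:, 3]])

def project_points(P, pts3d):
    """Project N world points (N, 3) to pixel coordinates (N, 2)."""
    pts3d = np.asarray(pts3d, dtype=float)
    homog = np.hstack([pts3d, np.ones((len(pts3d), 1))])
    uvw = homog @ np.asarray(P, dtype=float).T
    return uvw[:, :2] / uvw[:, 2:3]

def random_cylinder(rng, x_range, y_range, radius=0.3, height=1.8, n=16):
    """Sample a cylinder at a random ground position.

    Returns (2n, 3) points: n on the bottom rim (Z = 0), then n on the top rim.
    The radius and height defaults are assumed, not values from the paper.
    """
    cx = rng.uniform(*x_range)
    cy = rng.uniform(*y_range)
    ang = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    ring = np.stack([cx + radius * np.cos(ang), cy + radius * np.sin(ang)], axis=1)
    bottom = np.hstack([ring, np.zeros((n, 1))])
    top = np.hstack([ring, np.full((n, 1), height)])
    return np.vstack([bottom, top])

def toy_camera():
    """Hypothetical camera 10 m above the origin, looking straight down."""
    K = np.array([[500.0, 0.0, 320.0],
                  [0.0, 500.0, 240.0],
                  [0.0, 0.0, 1.0]])
    R = np.diag([1.0, -1.0, -1.0])      # camera z-axis points down toward the ground
    t = np.array([0.0, 0.0, 10.0])      # t = -R @ C with camera centre C = (0, 0, 10)
    return K @ np.hstack([R, t[:, None]])

if __name__ == "__main__":
    P = toy_camera()
    # One homography per parallel plane (multi-layer projection).
    heights = [0.0, 0.6, 1.2, 1.8]
    homographies = [plane_homography(P, h) for h in heights]
    # One random occluder, projected into this view.
    cyl = random_cylinder(np.random.default_rng(0), (-5, 5), (-5, 5))
    outline = project_points(P, cyl)
    print(len(homographies), outline.shape)
```

In a multi-camera setting the same cylinder would be projected with each camera's own P, and the convex hull of the projected outline would be filled in each image to form the occlusion. Each per-height homography would be inverted to warp a view's feature map onto the corresponding plane. Both of those steps are left out of this sketch.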

Item Type: Conference or Workshop Item (Unspecified)
Uncontrolled Keywords: Multi-view detection, Deep learning, Data augmentation, Perspective transformations
Divisions: Faculty of Science and Engineering > School of Electrical Engineering, Electronics and Computer Science
Depositing User: Symplectic Admin
Date Deposited: 01 Feb 2023 09:18
Last Modified: 01 Feb 2023 09:18
DOI: 10.1007/978-3-031-20080-9_40
URI: https://livrepository.liverpool.ac.uk/id/eprint/3168052