Multi-Task Learning with Convolutional Neural Networks

Xia, Y (2019) Multi-Task Learning with Convolutional Neural Networks. PhD thesis, University of Liverpool.

Abstract

Convolutional neural networks (CNNs) have achieved excellent performance on basic computer vision problems such as recognition and detection. However, CNNs remain an immature method, especially for multi-output classification. In traditional machine learning, the classic solution to such problems is multi-task learning (MTL), which was proposed early and has remained an active research topic. Joint research on MTL and CNNs, however, is rare. Encouragingly, MTL has already been integrated successfully with neural networks (NNs), and the CNN is a typical NN designed specifically for computer vision. Against this background, this thesis makes three main contributions.

First, MTL and CNNs are applied to face occlusion detection. This is the first time that MTL and CNNs have been used to detect occluded faces. The framework adopts a coarse-to-fine strategy and consists of two CNNs: the first is a region-based CNN that detects the head in an image of a person's upper body, and the second is a multi-task CNN that identifies which facial part is occluded in the head image. The experimental results show that CNNs can be integrated well with MTL.

Second, MTL and CNNs are used to jointly recognize vehicle logos and predict their attributes. To improve task performance, two MTL schemes are proposed: adaptive weighted task learning and switchable task learning. To verify the algorithms, a large and realistic vehicle logo attribute dataset is prepared, covering fifteen brands labeled with six visual attributes and three non-visual attributes. Extensive experiments are conducted in two scenarios, equal-priority learning and unequal-priority learning, with promising accuracies.

Third, we propose a principled approach to designing an evolutional tree-like multi-task deep learning framework that can be conveniently attached after any well-known multi-class classification network to further improve its performance. Our approach starts with a basic multi-class deep architecture and dynamically deepens it during training using a criterion that groups similar tasks together. Extensive evaluation on multi-class classification datasets (MNIST and CIFAR-10) and multi-label prediction datasets (the Berkeley Attributes of People dataset and CelebA) suggests that the models produced by the proposed method outperform the strong baseline.

Item Type:	Thesis (PhD)
Uncontrolled Keywords:	MTCNN, FMTCNN, WMTCNN, CMTCNN, EMTCNN
Divisions:	Faculty of Science and Engineering > School of Electrical Engineering, Electronics and Computer Science
Depositing User:	Symplectic Admin
Date Deposited:	10 Jul 2019 13:41
Last Modified:	19 Jan 2023 00:57
DOI:	10.17638/03034172
Supervisors:	Zhang, Bailing
URI:	https://livrepository.liverpool.ac.uk/id/eprint/3034172