Zhang, Zixing, Ringeval, Fabien, Dong, Bin, Coutinho, Eduardo (ORCID: 0000-0001-5234-1497), Marchi, Erik and Schuller, Bjoern
(2016) Enhanced Semi-Supervised Learning for Multimodal Emotion Recognition. In: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 20-25 March 2016.

Semi-Supervised Learning (SSL) techniques have found many applications where labelled data is scarce and/or expensive to obtain. However, SSL suffers from inherent limitations that constrain its performance in practice. A central problem is that the low accuracy a classifier delivers on challenging recognition tasks reduces the trustworthiness of the automatically labelled data. A related issue is noise accumulation: instances that are misclassified by the system are nevertheless used to train it in subsequent iterations. In this paper, we address both issues in the context of emotion recognition. First, we exploit the complementarity between audio-visual features to improve the performance of the classifier during the supervised phase. Then, we iteratively re-evaluate the automatically labelled instances to correct possibly mislabelled data, which enhances the overall confidence of the system's predictions. Experiments on the RECOLA database demonstrate that our methodology delivers strong performance in the classification of high/low emotional arousal (UAR = 76.5%), and significantly outperforms traditional SSL methods by at least 5.0% (absolute gain).
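The iterative re-evaluation strategy described in the abstract can be sketched as a self-training loop in which the labels of previously auto-labelled instances are refreshed by the current model before each retraining step. This is a minimal illustration, not the paper's implementation: the toy nearest-centroid classifier, the confidence threshold, and all function names are assumptions, standing in for the audio-visual models used in the paper.

```python
import numpy as np

class NearestCentroid:
    """Toy classifier (hypothetical stand-in for the paper's audio-visual models)."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict_proba(self, X):
        # Softmax over negative distances to centroids, shifted for stability.
        d = np.linalg.norm(X[:, None, :] - self.centroids_[None, :, :], axis=2)
        p = np.exp(-(d - d.min(axis=1, keepdims=True)))
        return p / p.sum(axis=1, keepdims=True)

    def predict(self, X):
        return self.classes_[np.argmax(self.predict_proba(X), axis=1)]

def enhanced_self_training(X_lab, y_lab, X_unlab, n_iters=5, threshold=0.8):
    """Self-training with iterative re-evaluation of auto-labelled instances
    (a sketch of the idea; threshold and iteration count are assumptions)."""
    auto_idx = np.array([], dtype=int)   # indices of auto-labelled instances
    auto_y = np.array([], dtype=int)     # their current (revisable) labels
    clf = NearestCentroid().fit(X_lab, y_lab)
    for _ in range(n_iters):
        # Re-evaluate previously auto-labelled instances with the current
        # model, correcting possibly mislabelled data from earlier rounds.
        if auto_idx.size:
            auto_y = clf.predict(X_unlab[auto_idx])
        # Add newly confident unlabelled instances to the training pool.
        conf = clf.predict_proba(X_unlab).max(axis=1)
        new = np.setdiff1d(np.where(conf >= threshold)[0], auto_idx)
        if new.size:
            auto_idx = np.concatenate([auto_idx, new])
            auto_y = np.concatenate([auto_y, clf.predict(X_unlab[new])])
        # Retrain on human-labelled plus auto-labelled data.
        X_train = np.vstack([X_lab, X_unlab[auto_idx]])
        y_train = np.concatenate([y_lab, auto_y])
        clf = NearestCentroid().fit(X_train, y_train)
    return clf
```

In a plain self-training loop the auto-assigned labels are frozen once added, so early mistakes accumulate; the re-evaluation step above lets later, better models overwrite them, which is the noise-correction idea the abstract describes.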

Item Type: Conference or Workshop Item (Unspecified)
Uncontrolled Keywords: Multimodal emotion recognition, enhanced semi-supervised learning
Depositing User: Symplectic Admin
Date Deposited: 27 Jun 2016 15:44
Last Modified: 19 Jan 2023 07:35
DOI: 10.1109/icassp.2016.7472666
URI: https://livrepository.liverpool.ac.uk/id/eprint/3001851