Collaborative Learning for Language and Speaker Recognition

Li, Lantian, Tang, Zhiyuan, Wang, Dong, Abel, Andrew, Feng, Yang and Zhang, Shiyue (2018) Collaborative Learning for Language and Speaker Recognition.


Official URL: http://dx.doi.org/10.1007/978-981-10-8111-8_6

Abstract

This paper presents a unified model that performs language and speaker recognition simultaneously. The model is based on a multi-task recurrent neural network in which the output of one task is fed in as the input of the other, giving a collaborative learning framework that can improve both language and speaker recognition by sharing information between the tasks. The preliminary experiments presented in this paper demonstrate that the multi-task model outperforms similar task-specific models on both language and speaker tasks. The improvement in language recognition is especially remarkable, which we believe is due to the speaker normalization effect of using information from the speaker recognition component.
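The cross-task feedback described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain tanh recurrent cells, random untrained weights, and that each task receives the other task's posterior from the previous frame. The names `TaskRNN`, `collaborative_forward`, and all dimensions are hypothetical and chosen only for illustration.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D vector."""
    e = np.exp(z - z.max())
    return e / e.sum()


class TaskRNN:
    """Hypothetical single-task recurrent cell with an extra input
    for the peer task's output (an assumption, not the paper's cell)."""

    def __init__(self, feat_dim, peer_dim, hidden_dim, n_classes, rng):
        scale = 0.1
        self.W_x = rng.normal(0.0, scale, (hidden_dim, feat_dim))    # acoustic input
        self.W_h = rng.normal(0.0, scale, (hidden_dim, hidden_dim))  # own recurrence
        self.W_p = rng.normal(0.0, scale, (hidden_dim, peer_dim))    # peer-task feedback
        self.W_o = rng.normal(0.0, scale, (n_classes, hidden_dim))   # output layer
        self.hidden_dim = hidden_dim
        self.n_classes = n_classes

    def step(self, x, h_prev, peer_out):
        """Advance one frame; peer_out is the other task's previous output."""
        h = np.tanh(self.W_x @ x + self.W_h @ h_prev + self.W_p @ peer_out)
        return h, softmax(self.W_o @ h)


def collaborative_forward(frames, lang_net, spk_net):
    """Run both tasks over an utterance, exchanging outputs frame by frame.

    Returns the frame-averaged language and speaker posteriors.
    """
    h_l = np.zeros(lang_net.hidden_dim)
    h_s = np.zeros(spk_net.hidden_dim)
    # Start from uniform posteriors so neither task biases the other at frame 0.
    y_l = np.full(lang_net.n_classes, 1.0 / lang_net.n_classes)
    y_s = np.full(spk_net.n_classes, 1.0 / spk_net.n_classes)
    lang_post, spk_post = [], []
    for x in frames:
        # Each task reads the peer's output from the previous frame.
        h_l_new, y_l_new = lang_net.step(x, h_l, y_s)
        h_s_new, y_s_new = spk_net.step(x, h_s, y_l)
        h_l, y_l, h_s, y_s = h_l_new, y_l_new, h_s_new, y_s_new
        lang_post.append(y_l)
        spk_post.append(y_s)
    return np.mean(lang_post, axis=0), np.mean(spk_post, axis=0)
```

In this sketch the language network's peer input has the size of the speaker output and vice versa, so with 2 languages and 5 speakers one would build `TaskRNN(feat_dim, 5, hidden_dim, 2, rng)` and `TaskRNN(feat_dim, 2, hidden_dim, 5, rng)`. Training, the choice of recurrent unit, and which internal signals are actually exchanged are left to the paper itself.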

Item Type:	Conference or Workshop Item (Unspecified)
Uncontrolled Keywords:	Basic Behavioral and Social Science, Neurosciences, Behavioral and Social Science, 1.2 Psychological and socioeconomic processes, 1 Underpinning research, Mental health
Depositing User:	Symplectic Admin
Date Deposited:	12 Feb 2020 11:44
Last Modified:	15 Mar 2024 23:33
DOI:	10.1007/978-981-10-8111-8_6
Open Access URL:	https://arxiv.org/abs/1609.08442
Related URLs:	Publisher
URI:	https://livrepository.liverpool.ac.uk/id/eprint/3074677