Automatically estimating emotion in music with deep long-short term memory recurrent neural networks

Coutinho, E ORCID: 0000-0001-5234-1497, Trigeorgis, G, Zafeiriou, S and Schuller, B
(2015) Automatically estimating emotion in music with deep long-short term memory recurrent neural networks.

Full text: ICL-me15em-paper-v4-cameraready.pdf (147 kB)

Abstract

In this paper we describe our approach to the MediaEval "Emotion in Music" task. Our method uses deep Long Short-Term Memory Recurrent Neural Networks (LSTM-RNN) for dynamic Arousal and Valence regression, with acoustic and psychoacoustic features extracted from the songs that have previously been shown to be effective for emotion prediction in music. Results on the challenge test set show excellent performance for Arousal estimation (r = 0.613 ± 0.278) but not for Valence (r = 0.026 ± 0.500). Issues with the reliability and distribution of the test set annotations are plausible explanations for this gap. Using a subset of the development set that was held out for performance estimation, we determined that the performance of our approach on Valence may be underestimated (Arousal: r = 0.596 ± 0.386; Valence: r = 0.458 ± 0.551).
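The core architecture described in the abstract can be sketched as follows. This is an illustrative single-layer LSTM with random, untrained weights that maps a sequence of acoustic feature frames to per-frame (Arousal, Valence) estimates; the actual system in the paper used deep (multi-layer) trained LSTM-RNNs, and the feature dimensionality and hidden size here are arbitrary assumptions, not values from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMRegressor:
    """Minimal single-layer LSTM for dynamic (per-frame) Arousal/Valence
    regression. Weights are random: a structural sketch, not the trained
    model from the paper."""

    def __init__(self, n_features, n_hidden, n_outputs=2, seed=0):
        rng = np.random.default_rng(seed)
        # One weight matrix and bias per gate: input, forget, cell, output.
        self.W = {g: rng.normal(0, 0.1, (n_hidden, n_features + n_hidden))
                  for g in "ifco"}
        self.b = {g: np.zeros(n_hidden) for g in "ifco"}
        self.W_out = rng.normal(0, 0.1, (n_outputs, n_hidden))
        self.b_out = np.zeros(n_outputs)
        self.n_hidden = n_hidden

    def forward(self, X):
        """X: (T, n_features) array -> (T, 2) predictions in (-1, 1)."""
        h = np.zeros(self.n_hidden)  # hidden state
        c = np.zeros(self.n_hidden)  # cell state
        outputs = []
        for x_t in X:
            z = np.concatenate([x_t, h])
            i = sigmoid(self.W["i"] @ z + self.b["i"])  # input gate
            f = sigmoid(self.W["f"] @ z + self.b["f"])  # forget gate
            g = np.tanh(self.W["c"] @ z + self.b["c"])  # candidate cell
            o = sigmoid(self.W["o"] @ z + self.b["o"])  # output gate
            c = f * c + i * g
            h = o * np.tanh(c)
            # tanh squashes outputs into (-1, 1), a common range for
            # continuous Arousal/Valence annotations.
            outputs.append(np.tanh(self.W_out @ h + self.b_out))
        return np.stack(outputs)

# 100 frames of 10 hypothetical acoustic/psychoacoustic features
# (e.g. loudness, spectral descriptors) -> per-frame A/V estimates.
model = LSTMRegressor(n_features=10, n_hidden=16)
features = np.random.default_rng(1).normal(size=(100, 10))
preds = model.forward(features)
print(preds.shape)  # one (arousal, valence) pair per input frame
```

Because the recurrent state carries information across frames, each prediction depends on the whole preceding feature sequence, which is what makes LSTM-RNNs suited to *dynamic* (time-continuous) emotion regression rather than clip-level classification.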

Item Type: Conference or Workshop Item (Unspecified)
Depositing User: Symplectic Admin
Date Deposited: 18 Apr 2016 14:54
Last Modified: 13 Oct 2021 04:10
URI: https://livrepository.liverpool.ac.uk/id/eprint/3000593