The Munich LSTM-RNN approach to the MediaEval 2014 "Emotion in Music" Task



Coutinho, E ORCID: 0000-0001-5234-1497, Weninger, F, Schuller, B and Scherer, KR
(2014) The Munich LSTM-RNN approach to the MediaEval 2014 "Emotion in Music" Task.

Abstract

In this paper we describe TUM's approach to the MediaEval "Emotion in Music" task. The goal of this task is to automatically estimate the emotions expressed by music (in terms of Arousal and Valence) in a time-continuous fashion. Our system consists of Long Short-Term Memory Recurrent Neural Networks (LSTM-RNN) for dynamic Arousal and Valence regression. We used two different sets of acoustic and psychoacoustic features that have previously been shown to be effective for emotion prediction in music and speech. The best model yielded an average Pearson's correlation coefficient of 0.354 (Arousal) and 0.198 (Valence), and an average Root Mean Squared Error of 0.102 (Arousal) and 0.079 (Valence).
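The two evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes that the "average" is taken over songs, with each song contributing one time-continuous prediction sequence and one annotation sequence; the abstract does not spell this out, and the function names below are hypothetical, not taken from the paper.

```python
import math

def pearson_r(pred, target):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in target)
    if var_p == 0 or var_t == 0:
        # Correlation is undefined for a constant sequence.
        raise ValueError("correlation undefined for constant input")
    return cov / math.sqrt(var_p * var_t)

def rmse(pred, target):
    """Root Mean Squared Error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))

def average_over_songs(songs):
    """Average both measures over songs (assumed averaging scheme).

    `songs` is a list of (prediction, annotation) pairs, one pair per song.
    """
    rs = [pearson_r(p, t) for p, t in songs]
    errors = [rmse(p, t) for p, t in songs]
    return sum(rs) / len(rs), sum(errors) / len(errors)
```

Under this scheme the Arousal and Valence dimensions would each be scored separately by calling `average_over_songs` once per dimension.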

Item Type: Conference or Workshop Item (Unspecified)
Depositing User: Symplectic Admin
Date Deposited: 11 Apr 2016 08:13
Last Modified: 09 Jan 2021 08:33
URI: https://livrepository.liverpool.ac.uk/id/eprint/3000423