Dynamic speech emotion recognition with state-space models

Konstantin Markov, Tomoko Matsui, Francois Septier, Gareth Peters

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

9 Citations (Scopus)

Abstract

Automatic emotion recognition from speech has focused mainly on identifying categorical or static affect states, but the spectrum of human emotion is continuous and time-varying. In this paper, we present a recognition system for dynamic speech emotion based on state-space models (SSMs). The prediction of the unknown emotion trajectory in the affect space spanned by the Arousal, Valence, and Dominance (A-V-D) descriptors is cast as a time-series filtering task. The state-space models we investigated include a standard linear model (Kalman filter) as well as a novel non-linear, non-parametric SSM based on Gaussian Processes (GPs). We use the AVEC 2014 database for evaluation. It provides ground-truth A-V-D labels, which allows the state and measurement functions to be learned separately and simplifies model training. For filtering with the GP SSM, we used two approximation methods: a recently proposed analytic method and a particle filter. All models were evaluated in terms of average Pearson correlation R and root mean square error (RMSE). The results show that, using the same feature vectors, the GP SSMs achieve twice the correlation and half the RMSE of a Kalman filter.
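As an illustration of the linear baseline and the two evaluation metrics named in the abstract, the following is a minimal sketch of a scalar Kalman filter tracking a single affect dimension, scored with Pearson correlation and RMSE. It is not the authors' implementation: the one-dimensional state, the synthetic data, the parameter values (`a`, `q`, `h`, `r`), and all function names are assumptions made for this example. The paper works with the full A-V-D space and with feature vectors from AVEC 2014.

```python
import math
import random


def kalman_filter_1d(observations, a, q, h, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter for the assumed linear model
    x_t = a * x_{t-1} + w_t,  w_t ~ N(0, q)   (state equation)
    y_t = h * x_t + v_t,      v_t ~ N(0, r)   (measurement equation)
    Returns the filtered state estimate at every time step."""
    x, p = x0, p0
    estimates = []
    for y in observations:
        # Predict step: propagate the mean and variance through the state model.
        x_pred = a * x
        p_pred = a * p * a + q
        # Update step: correct the prediction with the new observation.
        gain = p_pred * h / (h * p_pred * h + r)
        x = x_pred + gain * (y - h * x_pred)
        p = (1.0 - gain * h) * p_pred
        estimates.append(x)
    return estimates


def pearson_r(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))
    std_u = math.sqrt(sum((ui - mean_u) ** 2 for ui in u))
    std_v = math.sqrt(sum((vi - mean_v) ** 2 for vi in v))
    return cov / (std_u * std_v)


def rmse(u, v):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((ui - vi) ** 2 for ui, vi in zip(u, v)) / len(u))


# Synthetic "arousal" trajectory and noisy observations (hypothetical values).
rng = random.Random(0)
a, q, h, r = 0.95, 0.01, 1.0, 0.25
truth, obs = [], []
state = 0.0
for _ in range(500):
    state = a * state + rng.gauss(0.0, math.sqrt(q))
    truth.append(state)
    obs.append(h * state + rng.gauss(0.0, math.sqrt(r)))

est = kalman_filter_1d(obs, a, q, h, r)
print("R    (filtered vs. truth):", pearson_r(est, truth))
print("RMSE (filtered vs. truth):", rmse(est, truth))
print("RMSE (raw obs. vs. truth):", rmse(obs, truth))
```

In this sketch the model parameters are given rather than learned. In the setting the abstract describes, the ground-truth labels would allow the state and measurement functions to be fitted separately before filtering.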

Original language: English
Title of host publication: 2015 23rd European Signal Processing Conference (EUSIPCO)
Publisher: IEEE
Pages: 2077-2081
Number of pages: 5
ISBN (Electronic): 9780992862633
Publication status: Published - 28 Dec 2015
Event: 23rd European Signal Processing Conference 2015 - Nice, France
Duration: 31 Aug 2015 to 4 Sept 2015

Publication series

Name: European Signal Processing Conference
Publisher: IEEE
ISSN (Electronic): 2076-1465

Conference

Conference: 23rd European Signal Processing Conference 2015
Abbreviated title: EUSIPCO 2015
Country/Territory: France
City: Nice
Period: 31/08/15 to 4/09/15

Keywords

  • Affect recognition
  • Emotion recognition
  • Gaussian Process state-space model
  • Kalman filter

ASJC Scopus subject areas

  • Media Technology
  • Computer Vision and Pattern Recognition
  • Signal Processing
