Speaker and language independent voice quality classification applied to unlabelled corpora of expressive speech

John Kane, Stefan Scherer, Matthew Aylett, Louis-Philippe Morency, Christer Gobl

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

17 Citations (Scopus)

Abstract

Voice quality plays a pivotal role in speech style variation. Therefore, control and analysis of voice quality are critical for many areas of speech technology. Until now, most work has focused on small, purpose-built corpora. In this paper, we apply state-of-the-art voice quality analysis to large speech corpora built for expressive speech synthesis. A fuzzy-input fuzzy-output support vector machine classifier is trained and validated using features extracted from these corpora. We then apply this classifier to freely available audiobook data and demonstrate a clustering of the voice qualities that approximates the performance of human perceptual ratings. The ability to detect voice quality variation in these widely available unlabelled audiobook corpora means that the proposed method may be used as a valuable resource in expressive speech synthesis.
Original language: English
Title of host publication: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing
Publisher: IEEE
Pages: 7982-7986
Number of pages: 5
ISBN (Electronic): 9781479903566
Publication status: Published - 21 Oct 2013
Event: 38th IEEE International Conference on Acoustics, Speech and Signal Processing 2013 - Vancouver, Canada
Duration: 26 May 2013 to 31 May 2013

Conference

Conference: 38th IEEE International Conference on Acoustics, Speech and Signal Processing 2013
Abbreviated title: ICASSP 2013
Country/Territory: Canada
City: Vancouver
Period: 26/05/13 to 31/05/13
