Adding personality to neutral speech synthesis voices

Christopher G. Buchanan, Matthew P. Aylett, David A. Braude

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Citations (Scopus)

Abstract

A synthetic voice personifies the system using it. Previous work has shown that sub-corpora with different voice qualities (e.g. tense and lax) can be used to modify the perceived personality of a voice and to add expressive and emotional functionality. In this work we explore the use of LPC source/filter decomposition, together with modification of the residual, to artificially add voice quality sub-corpora to a voice without recording bespoke data. We evaluate this artificially enhanced voice against a baseline unit selection voice with pre-recorded sub-corpora. Although artificial modification impacts naturalness, it adds emotional range to voices where none was recorded in the source data, addresses the data sparsity issues caused by sub-corpora, and produces significant effects on perceived emotion.
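As a rough illustration of the source/filter idea named in the abstract, the sketch below estimates an LPC filter for one frame, inverse-filters the frame to obtain the residual, alters the residual, and resynthesises through the same filter. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the synthetic test frame, the LPC order of 10, and in particular the first-order `tilt_residual` step are all hypothetical stand-ins, since the abstract does not specify how the residual is modified.

```python
import math
import random


def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations; returns [1, a1, ..., ap] for A(z)."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this order
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def inverse_filter(frame, a):
    """Analysis (FIR) filter A(z): frame -> residual, zero initial state."""
    return [sum(a[k] * frame[n - k] for k in range(min(n, len(a) - 1) + 1))
            for n in range(len(frame))]


def synthesis_filter(excitation, a):
    """Synthesis (all-pole) filter 1/A(z): excitation -> signal."""
    out = []
    for n, e in enumerate(excitation):
        past = sum(a[k] * out[n - k]
                   for k in range(1, min(n, len(a) - 1) + 1))
        out.append(e - past)
    return out


def tilt_residual(residual, tilt):
    """Hypothetical residual modification: e'[n] = e[n] - tilt * e[n-1].

    Assumption for illustration only: tilt > 0 emphasises high frequencies,
    tilt < 0 attenuates them. This is not the paper's stated method.
    """
    return [e - tilt * (residual[n - 1] if n > 0 else 0.0)
            for n, e in enumerate(residual)]


# Synthetic stand-in for a voiced speech frame (two sinusoids plus noise).
random.seed(0)
frame = [math.sin(0.12 * n) + 0.5 * math.sin(0.31 * n)
         + 0.05 * random.gauss(0, 1) for n in range(240)]

a = levinson_durbin(autocorrelation(frame, 10), 10)
residual = inverse_filter(frame, a)
modified = synthesis_filter(tilt_residual(residual, 0.3), a)
```

With the residual left unmodified, `synthesis_filter(residual, a)` reproduces the input frame up to rounding error, so any change in the output comes from the residual modification alone. A real system would work on overlapping windowed frames of recorded speech rather than a single synthetic frame.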
Original language: English
Title of host publication: Speech and Computer. SPECOM 2018
Publisher: Springer
Pages: 49-57
Number of pages: 9
ISBN (Electronic): 978331999579-3
ISBN (Print): 9783319995786
Publication status: Published - 25 Aug 2018
Event: 20th International Conference on Speech and Computer 2018 - Leipzig, Germany
Duration: 18 Sept 2018 to 22 Sept 2018

Publication series

Name: Lecture Notes in Computer Science
Volume: 11096
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 20th International Conference on Speech and Computer 2018
Abbreviated title: SPECOM 2018
Country/Territory: Germany
City: Leipzig
Period: 18/09/18 to 22/09/18
