Synthesising natural-sounding vowels using a nonlinear dynamical model

Iain Mann, Steve McLaughlin

Research output: Contribution to journal › Article

8 Citations (Scopus)

Abstract

This paper addresses the issue of vowel sound synthesis using a nonlinear model comprising a free-running radial basis function (RBF) neural network with global feedback. Voiced speech production is modelled as the output of a nonlinear dynamical system rather than with the conventional linear source-filter approach; given the nonlinear nature of speech, this is expected to produce more natural-sounding synthetic speech. It is shown that the use of regularisation theory when learning the weights allows stable resynthesis when the network is operated with global feedback and no external input, correctly producing the desired vowel sound. Additionally, it is found that the dynamics of the vowel sound are well modelled, including the inter-pitch variations (jitter), thus making the synthesised vowel more natural-sounding than is possible with simple linear techniques. © 2001 Elsevier Science B.V. All rights reserved.
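
The idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a Gaussian RBF network trained as a one-step predictor on delay vectors, weights learned by regularised least squares, and then a free-running mode in which each output is fed back as the next input. The toy two-harmonic signal, the embedding dimension, the number of centres, the width and the regularisation parameter are all illustrative assumptions; none of these values comes from the paper.

```python
import numpy as np

def delay_embed(x, dim):
    """Rows are delay vectors [x[n], ..., x[n+dim-1]]; targets are x[n+dim]."""
    X = np.stack([x[i:len(x) - dim + i] for i in range(dim)], axis=1)
    return X, x[dim:]

def rbf_design(X, centres, width):
    """Gaussian RBF activations, one column per centre."""
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

def fit_weights(Phi, y, lam):
    """Regularised least squares: solve (Phi^T Phi + lam I) w = Phi^T y."""
    A = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ y)

def free_run(seed, centres, width, w, n_steps):
    """Run with global feedback and no external input after the seed vector."""
    state = np.array(seed, dtype=float)
    out = np.empty(n_steps)
    for n in range(n_steps):
        phi = rbf_design(state[None, :], centres, width)[0]
        out[n] = phi @ w
        # Shift the delay line and feed the prediction back in.
        state = np.append(state[1:], out[n])
    return out

# Toy periodic signal standing in for a recorded vowel (assumption).
rng = np.random.default_rng(0)
fs = 8000
t = np.arange(1600) / fs
signal = np.sin(2 * np.pi * 120 * t) + 0.5 * np.sin(2 * np.pi * 240 * t + 0.3)

dim, n_centres, width, lam = 6, 40, 0.5, 1e-3   # illustrative values only
X, y = delay_embed(signal, dim)
centres = X[rng.choice(len(X), size=n_centres, replace=False)]
Phi = rbf_design(X, centres, width)
w = fit_weights(Phi, y, lam)

synth = free_run(X[-1], centres, width, w, n_steps=400)
```

In this sketch the regularisation term `lam` penalises large weights, which is one common way to keep a fed-back predictor from amplifying small errors; whether a particular setting gives a stable, vowel-like orbit has to be checked by listening to or plotting `synth`.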

Original language: English
Pages (from-to): 1743-1756
Number of pages: 14
Journal: Signal Processing
Volume: 81
Issue number: 8
DOIs
Publication status: Published - Aug 2001
