Abstract
This paper addresses vowel sound synthesis using a nonlinear model comprising a free-running radial basis function (RBF) neural network with global feedback. Voiced speech production is modelled as the output of a nonlinear dynamical system rather than by the conventional linear source-filter approach; given the nonlinear nature of speech, this is expected to produce more natural-sounding synthetic speech. It is shown that using regularisation theory when learning the weights allows stable resynthesis when the network is operated with global feedback and no external input, correctly producing the desired vowel sound. Additionally, it is found that the dynamics of the vowel sound are well modelled, including the inter-pitch variations (jitter), making the synthesised vowel more natural-sounding than is possible with simple linear techniques. © 2001 Elsevier Science B.V. All rights reserved.
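The free-running scheme described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. The assumptions are: a synthetic two-harmonic signal stands in for a recorded vowel, centres are taken as a subsample of the training vectors, and a plain ridge penalty (`lam`) stands in for whatever regularisation scheme the paper uses. All function names and parameter values are hypothetical.

```python
import numpy as np

def delay_embed(x, dim):
    """Rows are [x[n], ..., x[n+dim-1]]; the target for each row is x[n+dim]."""
    X = np.stack([x[i:len(x) - dim + i] for i in range(dim)], axis=1)
    return X, x[dim:]

def rbf_design(X, centres, width):
    """Gaussian RBF activations for every (input vector, centre) pair."""
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

def fit_rbf(X, y, centres, width, lam):
    """Ridge-regularised least squares for the output weights (assumed penalty)."""
    Phi = rbf_design(X, centres, width)
    A = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ y)

def free_run(seed, centres, width, weights, n_steps):
    """Global feedback: each prediction is shifted back into the input vector."""
    state = np.array(seed, dtype=float)
    out = np.empty(n_steps)
    for n in range(n_steps):
        phi = rbf_design(state[None, :], centres, width)
        out[n] = (phi @ weights)[0]
        state = np.append(state[1:], out[n])  # no external input is used
    return out

# Synthetic stand-in for a voiced sound (assumption, not real speech data).
n = np.arange(800)
x = np.sin(2 * np.pi * n / 80) + 0.5 * np.sin(4 * np.pi * n / 80 + 0.3)

dim = 8                      # hypothetical embedding dimension
X, y = delay_embed(x, dim)
centres = X[::13]            # hypothetical centre selection: subsample the data
weights = fit_rbf(X, y, centres, width=0.5, lam=1e-3)

# Seed with the first training vector, then let the network run on its own output.
synth = free_run(X[0], centres, 0.5, weights, n_steps=400)
```

In this sketch the penalty `lam` shrinks the weight vector. Whether the free-running output settles onto the desired periodic waveform depends on `lam`, the width and the centres, so these values would need tuning on real data.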
Original language | English |
---|---|
Pages (from-to) | 1743-1756 |
Number of pages | 14 |
Journal | Signal Processing |
Volume | 81 |
Issue number | 8 |
Publication status | Published - Aug 2001 |