Speech Synthesis Using Artificial Neural Networks

Executive Summary

The parametric speech synthesizers of the early 1980s, also referred to as synthesis-by-rule systems, were built by carefully selecting parameters and writing a set of rules to manipulate them. Statistical Parametric Synthesis (SPS) instead uses machine learning algorithms to learn the parameters from features extracted from the speech signal. HTS [2, 3] and CLUSTERGEN [4] are statistical parametric synthesis engines that use hidden Markov models and Classification and Regression Trees (CART), respectively, to learn the parameters from speech data. In the SPS framework, spectral features are often represented by Mel-Log spectral approximation based cepstral coefficients, line spectral pairs, and harmonic noise model features.
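To make the idea of learning parameters from data concrete, the following is a minimal sketch of a single CART-style regression split: it chooses the threshold on one input feature that minimizes the squared error of a predicted spectral parameter. It is not the CLUSTERGEN implementation. The function names (`fit_stump`, `predict`), the input feature (relative position within a phone), and the toy values are all assumptions made for illustration.

```python
from statistics import mean


def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def fit_stump(xs, ys):
    """Fit a one-split regression tree (hypothetical helper).

    Returns (threshold, left_mean, right_mean), where the threshold
    minimizes the total squared error of ys on both sides of the split.
    """
    best = None
    points = sorted(set(xs))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(points, points[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, t, mean(left), mean(right))
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1:]


def predict(stump, x):
    """Return the leaf mean for the side of the split that x falls on."""
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


# Toy data, invented for illustration only:
# x = relative position within a phone, y = one cepstral coefficient.
positions = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
coeffs = [1.0, 1.2, 1.1, 3.0, 3.2, 3.1]

stump = fit_stump(positions, coeffs)
print(stump, predict(stump, 0.25), predict(stump, 0.85))
```

A full CART applies this split search recursively and over many contextual features. The single split is shown only because it isolates the step in which a parameter value is learned from data rather than set by a hand-written rule.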

  • Format: PDF
  • Size: 237 KB