Speech Synthesis Using Artificial Neural Networks
Parametric speech synthesizers of the early 1980s, also referred to as synthesis-by-rule systems, were built by carefully selecting parameters and hand-writing a set of rules to manipulate them. Statistical parametric synthesis (SPS) instead uses machine learning algorithms to learn the parameters from features extracted from the speech signal. HTS [2, 3] and CLUSTERGEN are statistical parametric synthesis engines that use hidden Markov models and classification and regression trees (CART), respectively, to learn the parameters from speech data. In the SPS framework, spectral features are often represented by mel-log spectral approximation based cepstral coefficients, line spectral pairs, or harmonic noise model features.
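To make the idea of learning parameters from data concrete, the following is a minimal sketch of a single CART-style regression split. It is not the CLUSTERGEN or HTS implementation. The binary context features (for example "is vowel", "is stressed"), the toy target values, and the function names `sse`, `best_split`, and `predict` are all assumptions made for illustration. Real systems grow full trees over many contextual questions and predict vectors of spectral features rather than one scalar.

```python
from statistics import mean


def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, targets):
    """Pick the yes/no question (feature index) that minimises total SSE.

    rows:    list of tuples of binary context features (hypothetical).
    targets: list of floats, a toy stand-in for one acoustic parameter.
    Returns (cost, feature_index, mean_if_yes, mean_if_no), or None if no
    feature separates the data into two non-empty groups.
    """
    best = None
    for f in range(len(rows[0])):
        yes = [t for r, t in zip(rows, targets) if r[f]]
        no = [t for r, t in zip(rows, targets) if not r[f]]
        if not yes or not no:
            continue  # this question does not partition the data
        cost = sse(yes) + sse(no)
        if best is None or cost < best[0]:
            best = (cost, f, mean(yes), mean(no))
    return best


def predict(split, row):
    """Predict the parameter as the mean of the leaf the row falls into."""
    _, f, yes_mean, no_mean = split
    return yes_mean if row[f] else no_mean


# Toy data: feature 0 explains the target, feature 1 does not.
rows = [(1, 0), (1, 1), (0, 0), (0, 1)]
targets = [5.0, 5.2, 1.0, 1.2]
split = best_split(rows, targets)
```

In this toy setting the split is chosen on feature 0, because grouping by it leaves far less variance in each leaf than grouping by feature 1. The same criterion, applied recursively, is what lets a regression tree replace hand-written rules with questions selected from the data.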