Unsupervised Pronunciation Validation


Executive Summary

This paper addresses the problem of selecting among candidate pronunciations for out-of-vocabulary words in speech processing tasks. The authors introduce a simple, unsupervised method that outperforms the conventional supervised method of forced alignment with a reference. The method's success is demonstrated independently with three metrics from large-scale speech tasks: word error rate for large-vocabulary continuous speech recognition, detection error tradeoff (DET) curves for spoken term detection, and phone error rate measured against a handcrafted pronunciation lexicon. The experiments were conducted using state-of-the-art recognition, indexing, and retrieval systems. The results were compared across many terms, hundreds of hours of speech, and well-known data sets.
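As an illustration of the third metric, phone error rate is commonly computed as the edit distance between a candidate pronunciation and the reference pronunciation, divided by the number of reference phones. The sketch below assumes that common definition; the function names and the ARPAbet-style phone symbols are hypothetical and are not taken from the paper or its scoring tools.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences.

    Substitutions, insertions, and deletions each cost 1.
    """
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference phone
                curr[j - 1] + 1,     # insert a hypothesis phone
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def phone_error_rate(ref_phones, hyp_phones):
    """Edit distance normalized by reference length (assumed definition)."""
    if not ref_phones:
        raise ValueError("reference pronunciation must not be empty")
    return edit_distance(ref_phones, hyp_phones) / len(ref_phones)


# Hypothetical lexicon entry and candidate pronunciation (illustrative only).
reference = ["T", "AH", "M", "EY", "T", "OW"]
candidate = ["T", "AH", "M", "AA", "T", "OW"]
per = phone_error_rate(reference, candidate)  # one substitution over six phones
```

Word error rate follows the same pattern, with words in place of phones.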

  • Format: PDF
  • Size: 189.4 KB