Design and Implementation of a Robot Audition System for Automatic Speech Recognition of Simultaneous Speech
Source: Kyoto University
This paper addresses robot audition that can cope, in real time, with speech that has a low Signal-to-Noise Ratio (SNR) by using robot-embedded microphones. To cope with such noise, the authors exploited two key ideas: preprocessing, consisting of sound source localization and separation with a microphone array, and system integration based on Missing Feature Theory (MFT). Preprocessing improves the SNR of a target sound signal using geometric source separation with a multichannel post-filter. MFT uses only reliable acoustic features in speech recognition and masks unreliable parts caused by errors in preprocessing.
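To illustrate the masking idea behind MFT, the following is a minimal sketch, not the authors' implementation. It assumes a hard (binary) reliability mask and a diagonal-covariance Gaussian acoustic model, and scores a feature vector using only the dimensions marked reliable. The function name `masked_log_likelihood` and all values are hypothetical and chosen only for illustration.

```python
import math


def masked_log_likelihood(features, means, variances, mask):
    """Diagonal-Gaussian log-likelihood over reliable features only.

    Assumption: `mask` holds one boolean per feature (True = reliable).
    Unreliable features are skipped, so they cannot affect the score.
    """
    total = 0.0
    for x, mu, var, reliable in zip(features, means, variances, mask):
        if not reliable:
            continue  # masked feature contributes nothing to the score
        total += -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)
    return total


# Hypothetical example: the second feature is corrupted by separation error.
features = [0.0, 5.0]
means = [0.0, 0.0]
variances = [1.0, 1.0]

full_score = masked_log_likelihood(features, means, variances, [True, True])
masked_score = masked_log_likelihood(features, means, variances, [True, False])
print(full_score, masked_score)
```

In this sketch the corrupted feature heavily penalizes the unmasked score, while the masked score depends only on the feature judged reliable. How the mask is estimated from the preprocessing stage is not covered here.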