CO-LDA: A Semi-Supervised Approach to Audio-Visual Person Recognition
Client models used in Automatic Speaker Recognition (ASR) and Automatic Face Recognition (AFR) are usually trained with labelled data acquired in a small number of enrolment sessions. The amount of training data is rarely sufficient to reliably represent the variation that occurs later during testing. Larger quantities of client-specific training data can always be obtained, but manual collection and labelling are often cost-prohibitive. Co-training, a paradigm of semi-supervised machine learning, can exploit unlabelled data to enhance weakly learned client models. In this paper, the authors propose a co-LDA algorithm that uses both labelled and unlabelled data to capture greater intersession variation and to learn discriminative subspaces in which test examples can be classified more accurately.