Speaker Information From Subband Energies of Linear Prediction Residual

Executive Summary

This paper shows that the subband energies of the Linear Prediction (LP) residual carry significant speaker information. The LP residual mostly contains the excitation source information. Subband energies extracted from the residual with a mel filterbank, followed by cepstral analysis, provide a compact representation. The resulting cepstral values are termed Residual-Mel Frequency Cepstral Coefficients (RMFCC). Speaker identification experiments using RMFCC features and a Gaussian Mixture Model (GMM) on a subset of 30 speakers from NIST-1999 yield 87% accuracy. MFCC extracted directly from the speech signal give the same 87% accuracy.
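The RMFCC pipeline described above (LP analysis, inverse filtering to obtain the residual, mel filterbank energies, cepstral analysis) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the LP order, number of mel filters, number of cepstral coefficients, FFT size, Hamming window, and all function names are assumptions chosen for the example, since the summary does not state them.

```python
import numpy as np

def lp_coefficients(frame, order):
    """Autocorrelation-method LP via Levinson-Durbin; returns a with a[0] = 1."""
    n = len(frame)
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0] + 1e-12  # small floor avoids division by zero on silent frames
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
        if err <= 0.0:
            break
    return a

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced uniformly on the mel scale."""
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2))
    freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(n_filters):
        lo, ctr, hi = edges[m], edges[m + 1], edges[m + 2]
        rising = (freqs - lo) / (ctr - lo)
        falling = (hi - freqs) / (hi - ctr)
        fbank[m] = np.maximum(0.0, np.minimum(rising, falling))
    return fbank

def dct_ii(x, n_out):
    """Unnormalized DCT-II, keeping the first n_out coefficients."""
    n = len(x)
    k = np.arange(n_out)[:, None]
    i = np.arange(n)[None, :]
    return np.cos(np.pi * k * (i + 0.5) / n) @ x

def rmfcc_frame(frame, sample_rate, lp_order=10, n_filters=24, n_ceps=13, n_fft=512):
    """Cepstral coefficients from mel subband energies of the LP residual.

    All default parameter values are illustrative assumptions.
    """
    windowed = frame * np.hamming(len(frame))
    a = lp_coefficients(windowed, lp_order)
    # Inverse filtering: e[n] = sum_k a[k] * s[n - k]
    residual = np.convolve(windowed, a)[:len(windowed)]
    power = np.abs(np.fft.rfft(residual, n_fft)) ** 2
    energies = mel_filterbank(n_filters, n_fft, sample_rate) @ power
    return dct_ii(np.log(energies + 1e-10), n_ceps)
```

Calling `rmfcc_frame` on each short frame of a recording (for example 20 ms frames, assuming 8 kHz telephone speech) would produce one feature vector per frame, which could then be used to train one GMM per speaker. Applying the same mel filterbank and DCT steps to the speech spectrum instead of the residual spectrum would give conventional MFCC for comparison.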

  • Format: PDF
  • Size: 78.53 KB