TY - JOUR
T1 - Speaker verification under mismatched data conditions
AU - Pillay, S.G.
AU - Ariyaeeinia, A.
AU - Pawlewski, M.
AU - Sivakumaran, P.
N1 - "This paper is a postprint of a paper submitted to and accepted for publication in IET Signal Processing and is subject to Institution of Engineering and Technology Copyright. The copy of record is available at IET Digital Library." [Full text of this article is not available in the UHRA]
PY - 2009
Y1 - 2009
AB - This study presents investigations into the effectiveness of state-of-the-art speaker verification techniques (i.e. GMM-UBM and GMM-SVM) in mismatched noise conditions. Based on experiments using white and real-world noise, it is shown that the verification performance offered by these methods is severely affected when the level of degradation in the test material is different from that in the training utterances. To address this problem, a modified realisation of the parallel model combination (PMC) method is introduced and a new form of test normalisation (T-norm), termed condition-adjusted T-norm, is proposed. It is experimentally demonstrated that the use of these techniques with GMM-UBM can significantly enhance the accuracy in mismatched noise conditions. Based on the experimental results, it is observed that the resultant relative improvement achieved for GMM-UBM (under the most severe mismatch condition considered) is in excess of 70%. Additionally, it is shown that the improvement in the verification accuracy achieved in this way is higher than that obtainable with the direct use of PMC with GMM-UBM. Moreover, it is found that while the accuracy of GMM-SVM can also benefit considerably from the use of these techniques, the extensive computational cost involved in this case severely limits the use of such a combined approach in practice.
KW - speaker verification technique
KW - mismatched noise condition
KW - white noise
KW - parallel model combination
KW - GMM-UBM
KW - GMM-SVM
KW - support vector machine
U2 - 10.1049/iet-spr.2008.0175
DO - 10.1049/iet-spr.2008.0175
M3 - Article
SN - 1751-9675
VL - 3
SP - 236
EP - 246
JO - IET Signal Processing
JF - IET Signal Processing
IS - 4
ER -