This paper investigates the relative effectiveness of two alternative approaches to open-set text-independent speaker identification (OSTI-SI): the recently introduced i-vector method and the more traditional GMM-UBM method supported by score normalisation. The study is motivated by the growing need for effective extraction of intelligence and evidence from audio recordings in the fight against crime. OSTI-SI is known to be the most challenging subclass of speaker recognition, and its adoption in criminal investigation applications is further complicated by undesired variations in speech characteristics caused by changing levels of environmental noise. The experimental investigations are conducted using a protocol developed for the identification task, based on the NIST speaker recognition evaluation corpus of 2008. To closely cover the conditions relevant to the considered application areas and to investigate identification performance in such scenarios, the speech data is contaminated with a range of real-world noise. The paper provides a detailed description of the experimental study and presents a thorough analysis of the results.
|Number of pages||6|
|Publication status||Published - Oct 2014|
|Event||48th IEEE International Carnahan Conference on Security Technology - Rome, Italy, 13 Oct 2014 → 16 Oct 2014|
- Open-set speaker identification