TY - GEN
T1 - Audio feature selection for recognition of non-linguistic vocalization sounds
AU - Theodorou, Theodoros
AU - Mporas, Iosif
AU - Fakotakis, Nikos
PY - 2014/1/1
Y1 - 2014/1/1
AB - Aiming at automatic detection of non-linguistic sounds from vocalizations, we investigate the applicability of various subsets of audio features, formed by ranking the relevance and individual quality of several audio features. Specifically, based on the ranking of a large set of audio descriptors, we selected subsets and evaluated them on the non-linguistic sound recognition task. During audio parameterization, every input utterance is converted to a single feature vector of 207 parameters. Next, a subset of this feature vector is fed to a classification model, which directly estimates the unknown sound class. The experimental evaluation showed that the feature vector composed of the 50 best-ranked parameters provides a good trade-off between computational demands and accuracy, and that the best recognition accuracy is observed for the 150-best subset.
KW - audio features
KW - classification algorithms
KW - non-linguistic vocalizations
KW - sound recognition
UR - http://www.scopus.com/inward/record.url?scp=84900536617&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-07064-3_32
DO - 10.1007/978-3-319-07064-3_32
M3 - Conference contribution
AN - SCOPUS:84900536617
SN - 9783319070636
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 395
EP - 405
BT - Artificial Intelligence
PB - Springer Nature
T2 - 8th Hellenic Conference on Artificial Intelligence: Methods and Applications, SETN 2014
Y2 - 15 May 2014 through 17 May 2014
ER -