University of Hertfordshire

A clustering based approach to reduce feature redundancy

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Standard

A clustering based approach to reduce feature redundancy. / Cordeiro De Amorim, Renato; Mirkin, Boris.

Knowledge, Information and Creativity Support Systems: Recent Trends, Advances and Solutions. Vol. 364 Springer, 2016. p. 465-475 (Advances in Intelligent Systems and Computing).

Harvard

Cordeiro De Amorim, R & Mirkin, B 2016, A clustering based approach to reduce feature redundancy. in Knowledge, Information and Creativity Support Systems: Recent Trends, Advances and Solutions. vol. 364, Advances in Intelligent Systems and Computing, Springer, pp. 465-475, Knowledge, Information and Creativity Support Systems, Krakow, Poland, 7/11/13. https://doi.org/10.1007/978-3-319-19090-7

APA

Cordeiro De Amorim, R., & Mirkin, B. (2016). A clustering based approach to reduce feature redundancy. In Knowledge, Information and Creativity Support Systems: Recent Trends, Advances and Solutions (Vol. 364, pp. 465-475). (Advances in Intelligent Systems and Computing). Springer. https://doi.org/10.1007/978-3-319-19090-7

Vancouver

Cordeiro De Amorim R, Mirkin B. A clustering based approach to reduce feature redundancy. In Knowledge, Information and Creativity Support Systems: Recent Trends, Advances and Solutions. Vol. 364. Springer. 2016. p. 465-475. (Advances in Intelligent Systems and Computing). https://doi.org/10.1007/978-3-319-19090-7

Author

Cordeiro De Amorim, Renato ; Mirkin, Boris. / A clustering based approach to reduce feature redundancy. Knowledge, Information and Creativity Support Systems: Recent Trends, Advances and Solutions. Vol. 364 Springer, 2016. pp. 465-475 (Advances in Intelligent Systems and Computing).

Bibtex

@inproceedings{cb63c4f9896a4dd2bf3c3cf66a459a6b,
title = "A clustering based approach to reduce feature redundancy",
abstract = "Research effort has recently focused on designing feature weighting clustering algorithms. These algorithms automatically calculate a weight for each feature in a data set, representing its degree of relevance. However, since most of these evaluate one feature at a time, they may have difficulty clustering data sets containing features with similar information. If a group of features contains the same relevant information, these clustering algorithms assign high weights to each feature in this group, instead of removing some because of their redundant nature. This paper introduces an unsupervised feature selection method that can be used in the data pre-processing step to reduce the number of redundant features in a data set. This method clusters similar features together and then selects a subset of representative features for each cluster. This selection is based on the maximum information compression index between each feature and its respective cluster centroid. We present an empirical validation for our method by comparing it with a popular unsupervised feature selection method on three EEG data sets. We find that our method selects features that produce better cluster recovery, without the need for an extra user-defined parameter.",
keywords = "unsupervised feature selection, feature weighting, redundant features, clustering, mental task, separation",
author = "{Cordeiro De Amorim}, Renato and Mirkin, Boris",
note = "This document is the Accepted Manuscript version of the following paper: Cordeiro de Amorim, R., and Mirkin, B., {\textquoteleft}A clustering based approach to reduce feature redundancy{\textquoteright}, in Proceedings, Andrzej M. J. Skulimowski and Janusz Kacprzyk, eds., Knowledge, Information and Creativity Support Systems: Recent Trends, Advances and Solutions, Selected papers from KICSS{\textquoteright}2013 - 8th International Conference on Knowledge, Information, and Creativity Support Systems, Krak{\'o}w, Poland, 7-9 November 2013. ISBN 978-3-319-19089-1, e-ISBN 978-3-319-19090-7. Available online at doi: 10.1007/978-3-319-19090-7. {\textcopyright} Springer International Publishing Switzerland 2016.; Knowledge, Information and Creativity Support Systems : Recent Trends, Advances and Solutions, KICSS'2013 ; Conference date: 07-11-2013 Through 09-11-2013",
year = "2016",
month = feb,
day = "26",
doi = "10.1007/978-3-319-19090-7",
language = "English",
isbn = "978-3-319-19089-1",
volume = "364",
series = "Advances in Intelligent Systems and Computing",
publisher = "Springer",
pages = "465--475",
booktitle = "Knowledge, Information and Creativity Support Systems: Recent Trends, Advances and Solutions",
url = "http://www.kicss2013.ipbf.eu/",

}

RIS

TY - GEN

T1 - A clustering based approach to reduce feature redundancy

AU - Cordeiro De Amorim, Renato

AU - Mirkin, Boris

N1 - This document is the Accepted Manuscript version of the following paper: Cordeiro de Amorim, R., and Mirkin, B., ‘A clustering based approach to reduce feature redundancy’, in Proceedings, Andrzej M. J. Skulimowski and Janusz Kacprzyk, eds., Knowledge, Information and Creativity Support Systems: Recent Trends, Advances and Solutions, Selected papers from KICSS’2013 - 8th International Conference on Knowledge, Information, and Creativity Support Systems, Kraków, Poland, 7-9 November 2013. ISBN 978-3-319-19089-1, e-ISBN 978-3-319-19090-7. Available online at doi: 10.1007/978-3-319-19090-7. © Springer International Publishing Switzerland 2016.

PY - 2016/2/26

Y1 - 2016/2/26

AB - Research effort has recently focused on designing feature weighting clustering algorithms. These algorithms automatically calculate a weight for each feature in a data set, representing its degree of relevance. However, since most of these evaluate one feature at a time, they may have difficulty clustering data sets containing features with similar information. If a group of features contains the same relevant information, these clustering algorithms assign high weights to each feature in this group, instead of removing some because of their redundant nature. This paper introduces an unsupervised feature selection method that can be used in the data pre-processing step to reduce the number of redundant features in a data set. This method clusters similar features together and then selects a subset of representative features for each cluster. This selection is based on the maximum information compression index between each feature and its respective cluster centroid. We present an empirical validation for our method by comparing it with a popular unsupervised feature selection method on three EEG data sets. We find that our method selects features that produce better cluster recovery, without the need for an extra user-defined parameter.

KW - unsupervised feature selection

KW - feature weighting

KW - redundant features

KW - clustering

KW - mental task

KW - separation

U2 - 10.1007/978-3-319-19090-7

DO - 10.1007/978-3-319-19090-7

M3 - Conference contribution

SN - 978-3-319-19089-1

VL - 364

T3 - Advances in Intelligent Systems and Computing

SP - 465

EP - 475

BT - Knowledge, Information and Creativity Support Systems: Recent Trends, Advances and Solutions

PB - Springer

T2 - Knowledge, Information and Creativity Support Systems

Y2 - 7 November 2013 through 9 November 2013

ER -
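
Illustrative sketch

The abstract describes clustering similar features and then choosing representatives by the maximum information compression index (MICI) between each feature and its cluster centroid. The Python sketch below is one possible reading of that idea, not the authors' implementation. It assumes the common definition of MICI as the smallest eigenvalue of the 2x2 covariance matrix of two variables, substitutes plain k-means for whatever clustering algorithm the paper uses, and keeps exactly one representative per cluster. The function names (`mici`, `kmeans`, `select_features`) and the parameter `k` are hypothetical; in particular, the abstract states that the published method needs no extra user-defined parameter, so the explicit `k` here is a simplification.

```python
import numpy as np


def mici(x, y):
    """Smallest eigenvalue of the 2x2 covariance matrix of x and y.

    Assumed definition of the maximum information compression index:
    it approaches 0 when x and y are linearly dependent.
    """
    cov = np.cov(x, y)
    return float(np.linalg.eigvalsh(cov)[0])  # eigenvalues are ascending


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means; a stand-in for the paper's clustering step."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()

    def assign(cents):
        dists = ((points[:, None, :] - cents[None, :, :]) ** 2).sum(axis=2)
        return dists.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centroids)
        updated = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return assign(centroids), centroids


def select_features(data, k, seed=0):
    """Return sorted column indices of one representative feature per cluster.

    data: array of shape (n_samples, n_features) with non-constant columns.
    """
    z = (data - data.mean(axis=0)) / data.std(axis=0)  # standardise each feature
    feats = z.T                                        # one row per feature
    labels, centroids = kmeans(feats, k, seed=seed)
    selected = []
    for j in range(k):
        members = np.flatnonzero(labels == j)
        if members.size == 0:
            continue
        scores = [mici(feats[i], centroids[j]) for i in members]
        selected.append(int(members[int(np.argmin(scores))]))
    return sorted(selected)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    a = rng.normal(size=200)
    b = rng.normal(size=200)
    # Columns 0-2 are near-copies of a; columns 3-4 are near-copies of b.
    data = np.column_stack([
        a,
        a + 0.05 * rng.normal(size=200),
        a + 0.05 * rng.normal(size=200),
        b,
        b + 0.05 * rng.normal(size=200),
    ])
    print(select_features(data, k=2))
```

On the toy data in the `__main__` block, two groups of near-duplicate columns should be reduced to one column from each group. Which column is kept within a group depends on the noise draw.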