Abstract
Feature selection is a popular data pre-processing step. The aim is to remove some of the features in a data set with minimum information loss, leading to a number of benefits including faster running time and easier data visualisation.
In this paper we introduce two unsupervised feature selection algorithms. These make use of a cluster-dependent feature-weighting mechanism reflecting the within-cluster degree of relevance of a given feature. Those features with
a relatively low weight are removed from the data set. We compare our algorithms to two other popular alternatives using a number of experiments on both synthetic and real-world data sets, with and without added noisy features.
These experiments demonstrate that our algorithms clearly outperform the alternatives.
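The cluster-dependent weighting idea described above can be sketched in code. The snippet below is an illustrative approximation, not the paper's exact algorithms: it assumes a W-k-means-style scheme in which each feature's weight within a cluster is inversely related to that feature's within-cluster dispersion, and a feature is removed when its weight is low in every cluster. The function names, the `beta` exponent, and the default threshold are all assumptions made for the example.

```python
# Illustrative sketch of cluster-dependent feature weighting (an assumption,
# not the paper's exact method).  A feature's weight in a cluster is
# inversely related to its within-cluster dispersion; features whose weight
# is low in every cluster are dropped.
import numpy as np

def cluster_feature_weights(X, labels, beta=2.0, eps=1e-12):
    """Per-cluster weights w_kj proportional to D_kj^(-1/(beta-1)), where
    D_kj is the within-cluster dispersion of feature j in cluster k.
    Weights are normalised to sum to 1 over features within each cluster."""
    clusters = np.unique(labels)
    W = np.zeros((len(clusters), X.shape[1]))
    for i, k in enumerate(clusters):
        Xk = X[labels == k]
        # Sum of squared deviations from the cluster centroid, per feature.
        D = ((Xk - Xk.mean(axis=0)) ** 2).sum(axis=0) + eps
        inv = D ** (-1.0 / (beta - 1.0))
        W[i] = inv / inv.sum()
    return W

def select_features(X, labels, threshold=None):
    """Keep features whose maximum weight across clusters reaches a
    threshold (default: the uniform weight 1 / n_features)."""
    W = cluster_feature_weights(X, labels)
    if threshold is None:
        threshold = 1.0 / X.shape[1]
    keep = W.max(axis=0) >= threshold
    return X[:, keep], keep

# Toy data: two tight, informative features plus one noisy feature.
rng = np.random.default_rng(0)
A = np.c_[rng.normal(0, 0.1, (50, 2)), rng.normal(0, 5.0, (50, 1))]
B = np.c_[rng.normal(5, 0.1, (50, 2)), rng.normal(0, 5.0, (50, 1))]
X = np.vstack([A, B])
labels = np.array([0] * 50 + [1] * 50)
X_reduced, keep = select_features(X, labels)
```

On this toy data the noisy third feature has large within-cluster dispersion in both clusters, so its weight falls below the uniform threshold and it is removed, while the two informative features are retained.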
Original language | English |
---|---|
Pages (from-to) | 44-52 |
Number of pages | 9 |
Journal | Information Processing Letters |
Volume | 129 |
Early online date | 21 Sept 2017 |
DOIs | |
Publication status | Published - 1 Jan 2018 |
Keywords
- Algorithms
- Clustering
- Feature selection