Applying subclustering and Lp distance in Weighted K-Means with distributed centroids

Renato Cordeiro De Amorim, Vladimir Makarenkov

    Research output: Contribution to journal › Article › peer-review

    18 Citations (Scopus)
    126 Downloads (Pure)

    Abstract

    We consider the Weighted K-Means algorithm with distributed centroids aimed at clustering data sets with numerical, categorical and mixed types of data. Our approach allows given features (i.e., variables) to have different weights at different clusters. Thus, it supports the intuitive idea that features may have different degrees of relevance at different clusters. We use the Minkowski metric in a way that feature weights become feature re-scaling factors for any considered exponent. Moreover, the traditional Silhouette clustering validity index was adapted to deal with both numerical and categorical types of features. Finally, we show that our new method usually outperforms traditional K-Means as well as the recently proposed WK-DC clustering algorithm.
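
    The abstract's remark that feature weights act as re-scaling factors can be illustrated with a minimal sketch. It assumes the weighted Minkowski form in which each cluster-specific weight is raised to the same exponent p as the difference; the abstract does not state the formula, so this form and the names `minkowski_weighted_distance`, `weights`, and `p` are assumptions for illustration. Only numerical features are covered; the paper's handling of categorical features through distributed centroids is not shown.

```python
def minkowski_weighted_distance(x, centroid, weights, p):
    """p-th power of a weighted Minkowski distance between a point and a centroid.

    Assumed form (not taken from the abstract): sum_v (w_v * |x_v - c_v|) ** p,
    where `weights` holds the feature weights of one particular cluster.
    """
    return sum((w * abs(a - c)) ** p for a, c, w in zip(x, centroid, weights))


def rescale(values, weights):
    """Multiply each feature value by its weight."""
    return [w * v for v, w in zip(values, weights)]


# Hypothetical point, centroid, and cluster-specific weights.
x = [1.0, 2.0]
centroid = [0.0, 0.0]
weights = [0.5, 0.5]
p = 2

weighted = minkowski_weighted_distance(x, centroid, weights, p)

# Re-scaling the data by the weights and using unit weights gives the same value,
# which is the sense in which weights become feature re-scaling factors.
unit = [1.0] * len(x)
rescaled = minkowski_weighted_distance(
    rescale(x, weights), rescale(centroid, weights), unit, p
)
```

    Because the weight sits inside the p-th power, the equivalence between weighting and re-scaling holds for any exponent p, not only p = 2.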
    Original language: English
    Pages (from-to): 700-707
    Journal: Neurocomputing
    Volume: 173
    Issue number: 3
    Early online date: 17 Aug 2015
    Publication status: Published - 15 Jan 2016
