University of Hertfordshire

Feature Relevance in Ward’s Hierarchical Clustering Using the Lp Norm

Research output: Contribution to journal, Article

Documents

  • MW_Ward

    Accepted author manuscript, 277 KB, PDF document

  • Renato Cordeiro De Amorim
Original language: English
Pages (from-to): 46-62
Journal: Journal of Classification
Volume: 32
Issue: 1
Early online date: 11 Mar 2015
Publication status: Published - Apr 2015

Abstract

In this paper we introduce a new hierarchical clustering algorithm called Ward_p. Unlike the original Ward, Ward_p generates feature weights, which can be seen as feature rescaling factors thanks to the use of the Lp norm. The feature weights are cluster dependent, allowing a feature to have different degrees of relevance in different clusters.
We validate our method by performing experiments on a total of 75 real-world and synthetic datasets, with and without added features made of uniformly random noise. Our experiments show that: (i) the use of our feature weighting method produces results that are superior to those produced by the original Ward method on datasets containing noise features; (ii) it is indeed possible to estimate a good exponent p under a totally unsupervised framework. The clusterings produced by Ward_p are dependent on p. This makes the estimation of a good value for this exponent a requirement for this algorithm, and indeed for any other also based on the Lp norm.
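
The remark that feature weights act as rescaling factors under the Lp norm can be illustrated with a short sketch. It is not the Ward_p merge criterion or the paper's implementation; the function names, sample vectors, and weights below are hypothetical and chosen only for illustration. Assuming non-negative weights, it checks that weighting each feature's contribution by w_v^p gives the same value as rescaling both points by w_v and then taking the unweighted p-th power Lp distance.

```python
import math


def weighted_lp_power(x, y, w, p):
    """Sum over features of w_v**p * |x_v - y_v|**p (weights assumed non-negative)."""
    return sum((wv ** p) * abs(xv - yv) ** p for xv, yv, wv in zip(x, y, w))


def lp_power(x, y, p):
    """Unweighted p-th power of the Lp (Minkowski) distance between x and y."""
    return sum(abs(xv - yv) ** p for xv, yv in zip(x, y))


def rescale(x, w):
    """Multiply each feature value by its weight."""
    return [wv * xv for xv, wv in zip(x, w)]


# Hypothetical data: two points with three features, and one weight per feature.
x = [1.0, 4.0, 0.5]
y = [2.0, 1.0, 0.0]
w = [0.6, 0.3, 0.1]
p = 3.0

weighted = weighted_lp_power(x, y, w, p)
rescaled = lp_power(rescale(x, w), rescale(y, w), p)

# Both routes agree up to floating-point error.
assert math.isclose(weighted, rescaled)
```

In a cluster-dependent scheme such as the one the abstract describes, each cluster would carry its own weight vector, so the same feature could be scaled differently in different clusters.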

ID: 9822939