TY - JOUR
T1 - Feature weighting in DBSCAN using reverse nearest neighbours
AU - Chowdhury, Stiphen
AU - Helian, Na
AU - Cordeiro de Amorim, Renato
N1 - © 2023 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY license. https://creativecommons.org/licenses/by/4.0/
PY - 2023/5/1
Y1 - 2023/5/1
N2 - DBSCAN is arguably the most popular density-based clustering algorithm, and it is capable of recovering non-spherical clusters. One of its main weaknesses is that it treats all features equally. In this paper, we propose a density-based clustering algorithm capable of calculating feature weights representing the degree of relevance of each feature, which takes the density structure of the data into account. First, we improve DBSCAN and introduce a new algorithm called DBSCANR. DBSCANR reduces the number of parameters of DBSCAN to one. Then, a new step is introduced to the clustering process of DBSCANR to iteratively update feature weights based on the current partition of data. The feature weights produced by the weighted version of the new clustering algorithm, W-DBSCANR, measure the relevance of variables in a clustering and can be used in feature selection in data mining applications where large and complex real-world data are often involved. Experimental results on both artificial and real-world data have shown that the new algorithms outperformed various DBSCAN-type algorithms in recovering clusters in data.
AB - DBSCAN is arguably the most popular density-based clustering algorithm, and it is capable of recovering non-spherical clusters. One of its main weaknesses is that it treats all features equally. In this paper, we propose a density-based clustering algorithm capable of calculating feature weights representing the degree of relevance of each feature, which takes the density structure of the data into account. First, we improve DBSCAN and introduce a new algorithm called DBSCANR. DBSCANR reduces the number of parameters of DBSCAN to one. Then, a new step is introduced to the clustering process of DBSCANR to iteratively update feature weights based on the current partition of data. The feature weights produced by the weighted version of the new clustering algorithm, W-DBSCANR, measure the relevance of variables in a clustering and can be used in feature selection in data mining applications where large and complex real-world data are often involved. Experimental results on both artificial and real-world data have shown that the new algorithms outperformed various DBSCAN-type algorithms in recovering clusters in data.
KW - DBSCAN
KW - Density-based clustering
KW - Reverse nearest neighbour
UR - http://www.scopus.com/inward/record.url?scp=85146435515&partnerID=8YFLogxK
U2 - 10.1016/j.patcog.2023.109314
DO - 10.1016/j.patcog.2023.109314
M3 - Article
SN - 0031-3203
VL - 137
JO - Pattern Recognition
JF - Pattern Recognition
M1 - 109314
ER -