A Nonparametric Split and Kernel-Merge Clustering Algorithm

Khurram Khan, Atiq ur Rehman*, Adnan Khan, Syed Rameez Naqvi, Samir Brahim Belhaouari, Amine Bermak

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

This work proposes split and kernel-merge clustering (S-KMC), a novel nonparametric clustering algorithm that combines the strengths of hierarchical clustering, partitional clustering, and density-based clustering. It consists of two main phases: splitting and merging. In the splitting phase, a ranking-based operator is used to divide the data into optimal subclusters. In the merging phase, a kernel function estimates the density of these subclusters after projecting them onto a straight line passing through their centers, facilitating the merging operation. S-KMC is fully nonparametric, eliminating the need for prior information about the data. It effectively handles 1) shape diversity, 2) density variability, 3) high dimensionality, 4) outliers, and 5) missing values. The algorithm offers easily tunable hyperparameters, enhancing its applicability to complex problems and robustness against data anomalies. Experimental analysis on 21 benchmark datasets demonstrates the improved performance of S-KMC in terms of clustering accuracy, handling high-dimensional data, and managing data anomalies and outliers. Comprehensive comparisons with state-of-the-art techniques further validate the superior or comparable performance of the proposed S-KMC algorithm.
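The merging phase described in the abstract (project two subclusters onto the line through their centers, then estimate density with a kernel) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the kernel, the bandwidth, or the merge rule, so the Gaussian kernel, the rule-of-thumb bandwidth, the `valley_ratio` threshold, and all function names below are assumptions chosen only for illustration.

```python
import numpy as np


def project_onto_center_line(a, b):
    """Return 1-D coordinates of both subclusters along the line joining their centers.

    Coordinate 0 is the center of `a`; coordinate `length` is the center of `b`.
    """
    center_a, center_b = a.mean(axis=0), b.mean(axis=0)
    direction = center_b - center_a
    length = np.linalg.norm(direction)
    if length == 0:
        raise ValueError("subcluster centers coincide; the line is undefined")
    unit = direction / length
    return (a - center_a) @ unit, (b - center_a) @ unit, length


def gaussian_kde_1d(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate of `samples` at each grid point."""
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    return kernel.sum(axis=1) / (samples.size * bandwidth)


def should_merge(a, b, valley_ratio=0.5, grid_size=200):
    """Hypothetical merge test: is there a deep density valley between the centers?"""
    t_a, t_b, length = project_onto_center_line(a, b)
    samples = np.concatenate([t_a, t_b])
    # Assumption: rule-of-thumb bandwidth based on the tighter subcluster's spread.
    spread = min(t_a.std(ddof=1), t_b.std(ddof=1))
    bandwidth = 1.06 * spread * samples.size ** (-0.2)
    if bandwidth <= 0:
        raise ValueError("subclusters need nonzero spread along the line")
    # Evaluate the density only on the segment between the two centers.
    grid = np.linspace(0.0, length, grid_size)
    density = gaussian_kde_1d(samples, grid, bandwidth)
    # Assumed rule: merge when the lowest density on the segment stays close
    # to the smaller of the two densities at the centers.
    density_at_centers = min(density[0], density[-1])
    return bool(density.min() >= valley_ratio * density_at_centers)
```

Under these assumptions, two well-separated blobs should be kept apart because the projected density drops sharply between their centers, while two halves of a single blob should be merged because no such valley appears. Reducing the comparison to one dimension keeps the density estimate cheap even when the original data are high-dimensional.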

Original language: English
Pages (from-to): 4443-4457
Number of pages: 15
Journal: IEEE Transactions on Artificial Intelligence
Volume: 5
Issue number: 9
Publication status: Published - 2024

Keywords

  • Density-based clustering
  • hierarchical clustering
  • kernel density estimation
  • nonparametric approaches
  • partitional clustering
