TY - GEN
T1 - Toward Perception-Based Evaluation of Clustering Techniques for Visual Analytics
AU - Aupetit, Michael
AU - Sedlmair, Michael
AU - Abbas, Mostafa M.
AU - Baggag, Abdelkader
AU - Bensmail, Halima
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/10
Y1 - 2019/10
N2 - Automatic clustering techniques play a central role in Visual Analytics by helping analysts to discover interesting patterns in high-dimensional data. Evaluating these clustering techniques, however, is difficult due to the lack of universal ground truth. Instead, clustering approaches are usually evaluated based on a subjective visual judgment of low-dimensional scatterplots of different datasets. As clustering is an inherent human-in-the-loop task, we propose a more systematic way of evaluating clustering algorithms based on quantification of human perception of clusters in 2D scatterplots. The core question we are asking is in how far existing clustering techniques align with clusters perceived by humans. To do so, we build on a dataset from a previous study [1], in which 34 human subjects la-beled 1000 synthetic scatterplots in terms of whether they could see one or more than one cluster. Here, we use this dataset to benchmark state-of-the-art clustering techniques in terms of how far they agree with these human judgments. More specifically, we assess 1437 variants of K-means, Gaussian Mixture Models, CLIQUE, DBSCAN, and Agglomerative Clustering techniques on these benchmarks data. We get unexpected results. For instance, CLIQUE and DBSCAN are at best in slight agreement on this basic cluster counting task, while model-agnostic Agglomerative clustering can be up to a substantial agreement with human subjects depending on the variants. We discuss how to extend this perception-based clustering benchmark approach, and how it could lead to the design of perception-based clustering techniques that would better support more trustworthy and explainable models of cluster patterns.
AB - Automatic clustering techniques play a central role in Visual Analytics by helping analysts to discover interesting patterns in high-dimensional data. Evaluating these clustering techniques, however, is difficult due to the lack of universal ground truth. Instead, clustering approaches are usually evaluated based on a subjective visual judgment of low-dimensional scatterplots of different datasets. As clustering is an inherent human-in-the-loop task, we propose a more systematic way of evaluating clustering algorithms based on quantification of human perception of clusters in 2D scatterplots. The core question we are asking is in how far existing clustering techniques align with clusters perceived by humans. To do so, we build on a dataset from a previous study [1], in which 34 human subjects la-beled 1000 synthetic scatterplots in terms of whether they could see one or more than one cluster. Here, we use this dataset to benchmark state-of-the-art clustering techniques in terms of how far they agree with these human judgments. More specifically, we assess 1437 variants of K-means, Gaussian Mixture Models, CLIQUE, DBSCAN, and Agglomerative Clustering techniques on these benchmarks data. We get unexpected results. For instance, CLIQUE and DBSCAN are at best in slight agreement on this basic cluster counting task, while model-agnostic Agglomerative clustering can be up to a substantial agreement with human subjects depending on the variants. We discuss how to extend this perception-based clustering benchmark approach, and how it could lead to the design of perception-based clustering techniques that would better support more trustworthy and explainable models of cluster patterns.
KW - H.1.2 [User/Machine Systems]: Human factors
KW - I.5.2 [Design Methodology]: Pattern analysis
KW - I.5.3 [Clustering]: Algorithms
UR - http://www.scopus.com/inward/record.url?scp=85078029401&partnerID=8YFLogxK
U2 - 10.1109/VISUAL.2019.8933620
DO - 10.1109/VISUAL.2019.8933620
M3 - Conference contribution
AN - SCOPUS:85078029401
T3 - 2019 IEEE Visualization Conference, VIS 2019
SP - 141
EP - 145
BT - 2019 IEEE Visualization Conference, VIS 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2019 IEEE Visualization Conference, VIS 2019
Y2 - 20 October 2019 through 25 October 2019
ER -