TY - JOUR
T1 - Ensemble clustering algorithm with supervised classification of clinical data for early diagnosis of coronary artery disease
AU - Kausar, Noreen
AU - Abdullah, Azween
AU - Belhaouari, Samir Brahim
AU - Palaniappan, Sellapan
AU - Alghamdi, Bandar Saeed
AU - Dey, Nilanjan
N1 - Publisher Copyright: © 2016 American Scientific Publishers. All rights reserved.
PY - 2016/2
Y1 - 2016/2
AB - Improving the detection accuracy of heart anomalies in clinical diagnosis is essential yet difficult because of irrelevant patient details and slow processing. This work aims to select relevant clinical features that improve classification performance in distinguishing abnormal from normal patients. For this purpose, Principal Component Analysis (PCA) is applied to reduce the attribute dimension, incorporating class identifiers to extract a minimal set of attributes that accounts for the largest portion of the total variance. The approach combines supervised and unsupervised learning methods, namely Support Vector Machines (SVM) and K-means clustering, for classification, with their related parameters and measures adjusted. K-means clustering groups similar data patterns into clusters, each of which is classified individually; overall accuracy is computed as the average of the accuracies achieved across all clusters. SVMs have good generalization ability and can classify even unseen test data with a model trained at the determined parameter values. Results on the University of California, Irvine (UCI) Cleveland Heart data set outperform earlier data mining approaches owing to the method's processing time, classification optimized by tuning the associated parameters, and selection of relevant attributes. In future, this approach can be used for multi-class classification of different medical datasets.
KW - Clustering
KW - Coronary Artery Disease (CAD)
KW - Dimension Reduction
KW - Feature Selection
KW - Principal Component Analysis (PCA)
KW - Support Vector Machines (SVM)
UR - http://www.scopus.com/inward/record.url?scp=84959315945&partnerID=8YFLogxK
U2 - 10.1166/jmihi.2016.1593
DO - 10.1166/jmihi.2016.1593
M3 - Article
AN - SCOPUS:84959315945
SN - 2156-7018
VL - 6
SP - 78
EP - 87
JO - Journal of Medical Imaging and Health Informatics
JF - Journal of Medical Imaging and Health Informatics
IS - 1
ER -