TY - JOUR
T1 - Deciphering the impact of diversity in CNN-based ensembles on overcoming data imbalance and scarcity in medical datasets
T2 - A case study on diabetic retinopathy
AU - Inamullah
AU - Hassan, Saima
AU - Belhaouari, Samir Brahim
AU - Amin, Ibrar
N1 - Publisher Copyright: © 2024 The Authors
PY - 2024/1
Y1 - 2024/1
AB - Early detection of diabetic retinopathy (DR) is critical in preventing vision loss. However, building accurate artificial intelligence (AI) models for multiple classes, including early-stage (Class 1) detection, is challenging because medical datasets are limited and imbalanced, and their availability is restricted by ethical and privacy concerns. Traditional ensemble models require structured data and therefore struggle with raw medical images, further complicating the issue. This study presents a novel deep learning-based ensemble model (EM) designed for multi-class DR classification and, specifically, for precise early-stage (Class 1) classification. The EM uses eight diverse convolutional neural networks (CNNs) with carefully crafted strategies to enhance diversity. Data augmentation and generation techniques address imbalanced data through data diversity, while parameter and architectural diversity within the CNN-based EM maximizes predictive performance. Evaluation on the publicly available Kaggle APTOS DR dataset demonstrates significant improvement over individual models and existing approaches. The proposed EM achieves multi-class accuracy of 93.00 %, precision of 93.00 %, sensitivity of 98.00 %, and specificity of 99.00 %. This research highlights the effectiveness of diversified CNN ensembles in overcoming the challenges posed by imbalanced and scarce data in multi-class DR classification. The approach paves the way for robust and accurate AI-powered diagnostic tools for improved diabetic retinopathy screening.
KW - Convolutional neural networks
KW - Deep learning
KW - Diabetic retinopathy
KW - Ensemble diversity
KW - Ensemble model
KW - Machine learning
KW - Medical image analysis
KW - Retinal images
UR - http://www.scopus.com/inward/record.url?scp=85199283285&partnerID=8YFLogxK
U2 - 10.1016/j.imu.2024.101557
DO - 10.1016/j.imu.2024.101557
M3 - Article
AN - SCOPUS:85199283285
SN - 2352-9148
VL - 49
JO - Informatics in Medicine Unlocked
JF - Informatics in Medicine Unlocked
M1 - 101557
ER -