TY - JOUR
T1 - Explainable event recognition
AU - Khan, Imran
AU - Ahmad, Kashif
AU - Gul, Namra
AU - Khan, Talhat
AU - Ahmad, Nasir
AU - Al-Fuqaha, Ala
N1 - Publisher Copyright: © 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
PY - 2023/11
Y1 - 2023/11
N2 - The literature shows outstanding capabilities for Convolutional Neural Networks (CNNs) in event recognition in images. However, few attempts have been made to analyze the potential causes behind the decisions of the models and to explore whether the predictions are based on event-salient objects/regions. To explore this important aspect of event recognition, we propose an explainable event recognition framework relying on Grad-CAM and an Xception architecture-based CNN model. Experiments are conducted on four large-scale datasets covering a diversified set of natural disasters, social, and sports events. Overall, the model showed outstanding generalization capabilities, obtaining overall F1 scores of 0.91, 0.94, and 0.97 on natural disasters, social, and sports events, respectively. Moreover, for subjective analysis of the activation maps generated through Grad-CAM for the predicted samples of the model, a crowd-sourcing study is conducted to analyze whether the model’s predictions are based on event-related objects/regions. The results of the study indicate that 78%, 84%, and 78% of the model decisions on natural disasters, sports, and social events datasets, respectively, are based on event-related objects/regions.
AB - The literature shows outstanding capabilities for Convolutional Neural Networks (CNNs) in event recognition in images. However, few attempts have been made to analyze the potential causes behind the decisions of the models and to explore whether the predictions are based on event-salient objects/regions. To explore this important aspect of event recognition, we propose an explainable event recognition framework relying on Grad-CAM and an Xception architecture-based CNN model. Experiments are conducted on four large-scale datasets covering a diversified set of natural disasters, social, and sports events. Overall, the model showed outstanding generalization capabilities, obtaining overall F1 scores of 0.91, 0.94, and 0.97 on natural disasters, social, and sports events, respectively. Moreover, for subjective analysis of the activation maps generated through Grad-CAM for the predicted samples of the model, a crowd-sourcing study is conducted to analyze whether the model’s predictions are based on event-related objects/regions. The results of the study indicate that 78%, 84%, and 78% of the model decisions on natural disasters, sports, and social events datasets, respectively, are based on event-related objects/regions.
KW - Convolutional neural networks
KW - Event recognition
KW - Explainability
KW - Grad-CAM
KW - Interpretation
KW - Multimedia indexing and retrieval
KW - Natural disasters
KW - Social events
KW - Sports events
UR - http://www.scopus.com/inward/record.url?scp=85151426251&partnerID=8YFLogxK
U2 - 10.1007/s11042-023-14832-0
DO - 10.1007/s11042-023-14832-0
M3 - Article
AN - SCOPUS:85151426251
SN - 1380-7501
VL - 82
SP - 40531
EP - 40557
JO - Multimedia Tools and Applications
JF - Multimedia Tools and Applications
IS - 26
ER -