TY - JOUR
T1 - A Deep Learning-Based Method for Crowd Counting Using Shunting Inhibition Mechanism
AU - Tivive, Fok Hing Chi
AU - Bouzerdoum, Abdesselam
AU - Phung, Son Lam
AU - Le, Hoang Thanh
AU - Baali, Hamza
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024/8/14
Y1 - 2024/8/14
AB - Image-based crowd counting has gained significant attention due to its widespread applications in security and surveillance. Recent advancements in deep learning have led to the development of numerous methods that have achieved remarkable success in accurately counting crowds. However, many of the existing deep learning methods, which have large model sizes, are unsuitable for deployment on edge devices. This article introduces a novel network architecture and processing element designed to create an efficient and compact deep learning model for crowd counting. The processing element, referred to as the shunting inhibitory neuron, generates complex decision boundaries, making it more powerful than the traditional perceptron. It is employed in both the encoder and decoder modules of the proposed model for feature extraction. Furthermore, the decoder includes alternating convolutional and transformer layers, which provide local receptive fields and global self-attention, respectively. This design captures rich contextual information that is used for generating accurate segmentation and density maps. The self-attention mechanism is implemented using convolution modulation instead of matrix multiplication to reduce computational costs. Experiments conducted on three challenging crowd counting datasets demonstrate that the proposed deep learning network, which has a small model size, achieves crowd counting performance comparable to that of state-of-the-art techniques. Code is available at https://github.com/ftivive/SINet.
KW - Crowd counting
KW - density map estimation
KW - shunting inhibition
UR - http://www.scopus.com/inward/record.url?scp=85201297922&partnerID=8YFLogxK
U2 - 10.1109/TAI.2024.3443789
DO - 10.1109/TAI.2024.3443789
M3 - Article
AN - SCOPUS:85201297922
SN - 2691-4581
VL - 5
SP - 5733
EP - 5745
JO - IEEE Transactions on Artificial Intelligence
JF - IEEE Transactions on Artificial Intelligence
IS - 11
ER -