TY - JOUR
T1 - Deep multi-level feature pyramids
T2 - Application for non-canonical firearm detection in video surveillance
AU - Lim, Jun Yi
AU - Al Jobayer, Md Istiaque
AU - Baskaran, Vishnu Monn
AU - Lim, Joanne Mun Yee
AU - See, John
AU - Wong, Kok Sheik
N1 - Funding Information:
This work was funded by the Malaysian Ministry of Education’s Fundamental Research Grant Scheme (Grant Number: FRGS/1/2018/ICT02/MUSM/02/3) and the School of Information Technology’s AI Compute Grant Scheme under the purview of Monash University Malaysia.
Publisher Copyright:
© 2020 Elsevier Ltd
Copyright:
Copyright 2020 Elsevier B.V., All rights reserved.
PY - 2021/1
Y1 - 2021/1
N2 - The worldwide epidemic of gun violence calls for an active video surveillance network to combat this crime. In this context, autonomously detecting handguns is crucial in capturing firearm-related crimes. However, current object detectors using deep learning are unable to capture handguns at different scales in an unconstrained environment. Hence, this paper puts forward an enhanced deep multi-level feature pyramid network that addresses the difficulty of inferring handguns from a non-canonical perspective. We first construct a dataset containing handguns in an unconstrained environment for representation learning. The dataset is built from a set of 250 recorded videos and contains over 2500 distinct labeled frames. Crucially, these labeled frames address the absence of a proper video surveillance-based handgun dataset. We then train a multi-level multi-scale object detector, M2Det, on this dataset. We further improve the performance of M2Det by: (1) enhancing the base features by concatenating shallow, medium and deep features from the backbone according to their relative receptive fields; (2) implementing generalized intersection-over-union as its localization loss; and (3) integrating Focal Loss as its classification loss to improve detection of small-scale handguns. Experiments on a challenging video surveillance test dataset demonstrate that the proposed model achieves 87.42% accuracy. In addition, we implement adaptive surveillance image partitioning to redetect handguns at specific regions. This method potentially solves the challenge of sporadically poor real-world handgun classifications. The model can serve as a pioneering approach to non-canonical handgun detection for active video surveillance systems. The dataset and trained models are available at: https://github.com/MarcusLimJunYi/Monash-Guns-Dataset.
KW - Active video surveillance
KW - deep neural network
KW - multi-level feature pyramids
KW - non-canonical firearm detection
UR - http://www.scopus.com/inward/record.url?scp=85096688863&partnerID=8YFLogxK
U2 - 10.1016/j.engappai.2020.104094
DO - 10.1016/j.engappai.2020.104094
M3 - Article
AN - SCOPUS:85096688863
SN - 0952-1976
VL - 97
JO - Engineering Applications of Artificial Intelligence
JF - Engineering Applications of Artificial Intelligence
M1 - 104094
ER -