TY - JOUR
T1 - Spatio-Temporal Point Process for Multiple Object Tracking
AU - Wang, Tao
AU - Chen, Kean
AU - Lin, Weiyao
AU - See, John
AU - Zhang, Zenghui
AU - Xu, Qian
AU - Jia, Xia
N1 - Funding Information:
This work was supported in part by the China Major Project for New Generation of AI under Grant 2018AAA0100400, in part by the National Natural Science Foundation of China under Grant 61971277, in part by ZTE Industry-Academia-Research Cooperation Funds and State Key Laboratory of Mobile Network and Mobile Multimedia Technology, and in part by CREST Malaysia under Grant T03C1-17.
Publisher Copyright:
© 2012 IEEE.
PY - 2023/4
Y1 - 2023/4
N2 - Multiple object tracking (MOT) focuses on modeling the relationships of detected objects across consecutive frames and merging them into different trajectories. MOT remains a challenging task, as noisy and confusing detection results often hinder the final performance. Furthermore, most existing research focuses on improving detection algorithms and association strategies. As such, we propose a novel framework that can effectively predict and mask out the noisy and confusing detection results before associating the objects into trajectories. In particular, we formulate such 'bad' detection results as a sequence of events and adopt the spatio-temporal point process to model such events. Traditionally, the occurrence rate in a point process is characterized by an explicitly defined intensity function, which depends on prior knowledge of the specific task. Thus, designing a proper model is expensive and time-consuming, and its ability to generalize is limited. To tackle this problem, we adopt the convolutional recurrent neural network (conv-RNN) to instantiate the point process, whose intensity function is learned automatically from the training data. Furthermore, we show that our method captures both temporal and spatial evolution, which is essential in modeling events for MOT. Experimental results demonstrate notable improvements in addressing noisy and confusing detection results on MOT data sets. Improved state-of-the-art performance is achieved by incorporating our baseline MOT algorithm with the spatio-temporal point process model.
AB - Multiple object tracking (MOT) focuses on modeling the relationships of detected objects across consecutive frames and merging them into different trajectories. MOT remains a challenging task, as noisy and confusing detection results often hinder the final performance. Furthermore, most existing research focuses on improving detection algorithms and association strategies. As such, we propose a novel framework that can effectively predict and mask out the noisy and confusing detection results before associating the objects into trajectories. In particular, we formulate such 'bad' detection results as a sequence of events and adopt the spatio-temporal point process to model such events. Traditionally, the occurrence rate in a point process is characterized by an explicitly defined intensity function, which depends on prior knowledge of the specific task. Thus, designing a proper model is expensive and time-consuming, and its ability to generalize is limited. To tackle this problem, we adopt the convolutional recurrent neural network (conv-RNN) to instantiate the point process, whose intensity function is learned automatically from the training data. Furthermore, we show that our method captures both temporal and spatial evolution, which is essential in modeling events for MOT. Experimental results demonstrate notable improvements in addressing noisy and confusing detection results on MOT data sets. Improved state-of-the-art performance is achieved by incorporating our baseline MOT algorithm with the spatio-temporal point process model.
KW - Multiple object tracking
KW - recurrent neural networks
KW - spatio-temporal point processes
UR - http://www.scopus.com/inward/record.url?scp=85152182685&partnerID=8YFLogxK
U2 - 10.1109/TNNLS.2020.2997006
DO - 10.1109/TNNLS.2020.2997006
M3 - Article
C2 - 32511094
AN - SCOPUS:85152182685
SN - 2162-237X
VL - 34
SP - 1777
EP - 1788
JO - IEEE Transactions on Neural Networks and Learning Systems
JF - IEEE Transactions on Neural Networks and Learning Systems
IS - 4
ER -