Abstract
Multiple object tracking (MOT) is an important yet challenging task in video understanding and analysis. MOT aims to associate detected objects into trajectories based on their temporal relationships. Occlusion among moving objects poses a major challenge to robust modeling of these relationships. In this paper, we propose a novel Tracklet Siamese Network (TSN) that learns similarities between tracklets characterized by appearance information and achieves superior performance on two MOTChallenge benchmark datasets. Our framework constructs short tracklets from highly related object detections by excluding inaccurate detections. We also adopt a constrained clustering technique to piece tracklets together into long trajectories, thus recovering many detections that were missed by the original detector or removed in the previous step. We report comparisons against state-of-the-art methods, and ablation studies further substantiate the viability of the components in our approach.
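The constrained clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each tracklet is summarized by a set of frame indices and a single appearance feature vector, uses plain cosine similarity as a stand-in for the learned TSN similarity, and treats temporal overlap as a cannot-link constraint (two tracklets present in the same frame cannot be the same object). The function name `cluster_tracklets` and the parameter `sim_threshold` are hypothetical.

```python
import numpy as np


def cluster_tracklets(frames, feats, sim_threshold=0.7):
    """Greedy agglomerative merging under cannot-link constraints.

    frames: list of sets of frame indices, one set per tracklet
    feats:  (N, D) array-like of per-tracklet appearance features
    Returns a list of clusters, each a sorted list of tracklet indices.

    Assumption: cosine similarity stands in for a learned similarity.
    """
    feats = np.asarray(feats, dtype=float)
    feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    clusters = [
        {"ids": [i], "frames": set(frames[i]), "feat": feats[i]}
        for i in range(len(frames))
    ]
    while True:
        best, best_sim = None, sim_threshold
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Cannot-link: clusters sharing a frame are different objects.
                if not clusters[i]["frames"].isdisjoint(clusters[j]["frames"]):
                    continue
                sim = float(clusters[i]["feat"] @ clusters[j]["feat"])
                if sim > best_sim:
                    best, best_sim = (i, j), sim
        if best is None:
            break  # no admissible pair exceeds the threshold
        i, j = best
        a, b = clusters[i], clusters.pop(j)  # j > i, so index i stays valid
        # Size-weighted mean of the two unit features, renormalized.
        merged = a["feat"] * len(a["ids"]) + b["feat"] * len(b["ids"])
        a["feat"] = merged / np.linalg.norm(merged)
        a["ids"] = sorted(a["ids"] + b["ids"])
        a["frames"] |= b["frames"]
    return [c["ids"] for c in clusters]
```

In this sketch, two tracklets with nearly identical appearance are still kept apart if they overlap in time, while a later tracklet can be attached to whichever of them it resembles most, which is how a trajectory can bridge a gap left by missed or removed detections.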
Original language | English |
---|---|
Title of host publication | 2018 IEEE Visual Communications and Image Processing (VCIP) |
Publisher | IEEE |
ISBN (Electronic) | 9781538644584 |
Publication status | Published - 25 Apr 2019 |
Event | 33rd IEEE International Conference on Visual Communications and Image Processing 2018 - Taichung, Taiwan, Province of China; Duration: 9 Dec 2018 → 12 Dec 2018 |
Conference
Conference | 33rd IEEE International Conference on Visual Communications and Image Processing 2018 |
---|---|
Abbreviated title | VCIP 2018 |
Country/Territory | Taiwan, Province of China |
City | Taichung |
Period | 9/12/18 → 12/12/18 |
Keywords
- Constrained clustering
- Local temporal pooling
- Multiple object tracking
- Tracklet
- Tracklet siamese network
ASJC Scopus subject areas
- Computer Networks and Communications
- Signal Processing