Multi-camera Tracking Based on Spatio-Temporal Association in Small Overlapping Regions

  • Conference paper
Intelligent Computing (SAI 2024)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 1018)

Abstract

Multi-camera tracking (MCT) aims to track people across multiple cameras. To match tracks across cameras, existing MCT solutions rely primarily on person re-identification (Re-ID), which compares people’s visual appearance. However, this approach fails when people look very similar, for example when they wear uniforms in a workplace. In this paper, we propose a method based on spatio-temporal association (STA) to overcome the limitations of visual Re-ID when tracking people of similar appearance. The method operates effectively when the cameras’ fields of view overlap, even slightly, and a moderate number of people (at most 4 to 7) move close to one another in each overlapping region. We evaluate the method on a private dataset that we prepared and on the public PETS2009 dataset. The experimental results show that the method correctly matches people who appear in multiple cameras, outperforms MCT based on visual Re-ID when people have similar appearances, and works well even when the overlapping region is small. To further strengthen the method, we perform an error analysis and introduce three extensions that mitigate missing detections and inaccurate footpoint interpolation. These extensions further improve the frame-level matching accuracy of the baseline method.
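
The abstract does not spell out the association step, so the following is only a minimal sketch of what spatio-temporal matching in an overlapping region could look like, not the authors’ implementation. It assumes that each camera has a known 3x3 homography to a shared ground plane, that every track stores one footpoint per frame, and that tracks are paired by the smallest mean ground-plane distance over their shared frames. The names `apply_homography`, `track_distance`, `match_tracks`, and the `max_dist` threshold are hypothetical. Because the abstract limits each overlapping region to a handful of people, the sketch uses an exhaustive search over one-to-one assignments rather than a dedicated assignment solver.

```python
from itertools import permutations
from math import hypot, inf

UNMATCHABLE = 1e9  # cost used for pairs that never co-occur or are too far apart


def apply_homography(H, point):
    """Map an image point (x, y) to ground-plane coordinates with a 3x3 matrix H."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)


def track_distance(track_a, track_b, H_a, H_b):
    """Mean ground-plane distance over frames present in both tracks.

    Each track is a dict {frame_index: (x, y) footpoint in image coordinates}.
    Returns None when the two tracks share no frame.
    """
    shared = sorted(set(track_a) & set(track_b))
    if not shared:
        return None
    total = 0.0
    for frame in shared:
        ax, ay = apply_homography(H_a, track_a[frame])
        bx, by = apply_homography(H_b, track_b[frame])
        total += hypot(ax - bx, ay - by)
    return total / len(shared)


def match_tracks(tracks_a, tracks_b, H_a, H_b, max_dist=1.0):
    """Return {id_a: id_b} for the one-to-one assignment with the lowest total cost.

    max_dist is an assumed gating threshold in ground-plane units.
    """
    ids_a, ids_b = list(tracks_a), list(tracks_b)
    cost = {}
    for a in ids_a:
        for b in ids_b:
            d = track_distance(tracks_a[a], tracks_b[b], H_a, H_b)
            cost[a, b] = UNMATCHABLE if d is None or d > max_dist else d

    # Enumerate assignments of the smaller id set onto the larger one.
    swap = len(ids_a) > len(ids_b)
    small, large = (ids_b, ids_a) if swap else (ids_a, ids_b)
    best_pairs, best_cost = [], inf
    for perm in permutations(large, len(small)):
        pairs = [(l, s) if swap else (s, l) for s, l in zip(small, perm)]
        total = sum(cost[p] for p in pairs)
        if total < best_cost:
            best_pairs, best_cost = pairs, total

    # Drop pairs that were only kept to complete the assignment.
    return {a: b for a, b in best_pairs if cost[a, b] < UNMATCHABLE}


# Toy example: camera A already reports ground coordinates (identity homography),
# camera B reports pixels at twice the ground scale.
H_A = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
H_B = [[0.5, 0, 0], [0, 0.5, 0], [0, 0, 1]]
tracks_A = {1: {0: (0.0, 0.0), 1: (1.0, 0.0)}, 2: {0: (5.0, 5.0), 1: (5.0, 6.0)}}
tracks_B = {"x": {0: (10.2, 10.0), 1: (10.0, 12.2)}, "y": {0: (0.2, 0.0), 1: (2.0, 0.2)}}
print(match_tracks(tracks_A, tracks_B, H_A, H_B))  # {1: 'y', 2: 'x'}
```

The matching here uses position and time only, so two people in identical uniforms are still separated as long as their footpoints are distinguishable on the ground plane. Exhaustive search grows factorially with the number of people and is only reasonable for the small group sizes the abstract mentions.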

Acknowledgment

We gratefully acknowledge the support of AWL, Inc. for this research. We also thank our colleagues at AWL Vietnam for their helpful support and discussions.

Corresponding author

Correspondence to Lap Quoc Tran.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Tran, L.Q., Pham, M.C., Nguyen, Q.N. (2024). Multi-camera Tracking Based on Spatio-Temporal Association in Small Overlapping Regions. In: Arai, K. (eds) Intelligent Computing. SAI 2024. Lecture Notes in Networks and Systems, vol 1018. Springer, Cham. https://doi.org/10.1007/978-3-031-62269-4_33
