Abstract
We consider the problem of tracking an unknown small target in aerial videos captured from medium to high altitudes. This is a challenging problem, made even harder by unavoidable drastic camera motion and high-density scenes. To address it, we introduce a context-aware IoU-guided tracker (COMET) that exploits a multitask two-stream network and an offline reference proposal generation strategy. The network fully exploits target-related information through multi-scale feature learning and attention modules. The proposed strategy provides efficient sampling that generalizes the network to the target and its parts without imposing extra computational complexity during online tracking. Together, these components contribute considerably to handling significant occlusions and viewpoint changes. Empirically, COMET outperforms the state of the art on a range of aerial-view datasets that focus on tracking small objects. Specifically, COMET outperforms the celebrated ATOM tracker by an average margin of \(6.2\%\) (and \(7\%\)) in precision (and success) score on the challenging UAVDT, VisDrone-2019, and Small-90 benchmarks.
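To make the "IoU-guided" idea concrete: ATOM-style trackers (which COMET builds on) refine candidate bounding boxes by gradient ascent on a predicted overlap score. The sketch below is a toy, self-contained illustration of that refinement loop; it substitutes the exact geometric IoU with finite-difference gradients for the learned, differentiable IoU-prediction network that the actual trackers use, and the box values are made up for the example.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as [x, y, w, h]."""
    iw = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def refine(box, target, steps=300, lr=2.0, eps=1e-3):
    """Gradient-ascent refinement of `box` to maximize IoU with `target`.

    Central finite differences stand in for backpropagation through a
    learned IoU-predictor network, as used in ATOM-style trackers."""
    box = list(box)
    for _ in range(steps):
        grad = []
        for i in range(4):
            hi, lo = box[:], box[:]
            hi[i] += eps
            lo[i] -= eps
            grad.append((iou(hi, target) - iou(lo, target)) / (2 * eps))
        box = [c + lr * g for c, g in zip(box, grad)]
    return box

# Hypothetical example: a coarse proposal around a 20x20 target.
target = [50.0, 50.0, 20.0, 20.0]
proposal = [44.0, 46.0, 26.0, 24.0]
print(iou(proposal, target))                  # coarse overlap
print(iou(refine(proposal, target), target))  # refined, closer to 1.0
```

In the real trackers the IoU predictor is conditioned on image features of the target, so the same ascent step simultaneously localizes and scales the box; the toy version only captures the optimization structure.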
S. M. Marvasti-Zadeh and J. Khaghani—Equal contribution.
References
Du, D., Zhu, P., Wen, L., Bian, X., Ling, H., et al.: VisDrone-SOT2019: the vision meets drone single object tracking challenge results. In: Proceedings of ICCVW (2019)
Du, D., et al.: The unmanned aerial vehicle benchmark: object detection and tracking. In: Proceedings of ECCV, pp. 375–391 (2018)
Marvasti-Zadeh, S.M., Cheng, L., Ghanei-Yakhdan, H., Kasaei, S.: Deep learning for visual tracking: a comprehensive survey. IEEE Trans. Intell. Trans. Syst. 1–26 (2021). https://doi.org/10.1109/TITS.2020.3046478
Bonatti, R., Ho, C., Wang, W., Choudhury, S., Scherer, S.: Towards a robust aerial cinematography platform: localizing and tracking moving targets in unstructured environments. In: Proceedings of IROS, pp. 229–236 (2019)
Zhang, H., Wang, G., Lei, Z., Hwang, J.: Eye in the sky: drone-based object tracking and 3D localization. In: Proceedings of Multimedia, pp. 899–907 (2019)
Zhu, P., Wen, L., Du, D., Bian, X., Hu, Q., Ling, H.: Vision meets drones: past, present and future (2020)
Zhu, P., Wen, L., Du, D., et al.: VisDrone-VDT2018: the vision meets drone video detection and tracking challenge results. In: Proceedings of ECCVW, pp. 496–518 (2018)
Yu, H., Li, G., Zhang, W., et al.: The unmanned aerial vehicle benchmark: object detection, tracking and baseline. Int. J. Comput. Vis. 128(5), 1141–1159 (2019)
Mueller, M., Smith, N., Ghanem, B.: A benchmark and simulator for UAV tracking. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 445–461. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0_27
Liu, C., Ding, W., Yang, J., et al.: Aggregation signature for small object tracking. IEEE Trans. Image Process. 29, 1738–1747 (2020)
Wu, Y., Lim, J., Yang, M.: Object tracking benchmark. IEEE Trans. Pattern Anal. Mach. Intell. 37, 1834–1848 (2015)
Liang, P., Blasch, E., Ling, H.: Encoding color information for visual tracking: algorithms and benchmark. IEEE Trans. Image Process. 24, 5630–5644 (2015)
Tong, K., Wu, Y., Zhou, F.: Recent advances in small object detection based on deep learning: a review. Image Vis. Comput. 97, 103910 (2020)
LaLonde, R., Zhang, D., Shah, M.: ClusterNet: detecting small objects in large scenes by exploiting spatio-temporal information. In: Proceedings of CVPR (2018)
Bai, Y., Zhang, Y., Ding, M., Ghanem, B.: SOD-MTGAN: small object detection via multi-task generative adversarial network. In: Proceedings of ECCV (2018)
Huang, Z., Fu, C., Li, Y., Lin, F., Lu, P.: Learning aberrance repressed correlation filters for real-time UAV tracking. In: Proceedings of IEEE ICCV, pp. 2891–2900 (2019)
Fu, C., Huang, Z., Li, Y., Duan, R., Lu, P.: Boundary effect-aware visual tracking for UAV with online enhanced background learning and multi-frame consensus verification. In: Proceedings of IROS, pp. 4415–4422 (2019)
Li, F., Fu, C., Lin, F., Li, Y., Lu, P.: Training-set distillation for real-time UAV object tracking. In: Proceedings of ICRA, pp. 1–7 (2020)
Li, Y., Fu, C., Huang, Z., Zhang, Y., Pan, J.: Keyfilter-aware real-time UAV object tracking. In: Proceedings of ICRA (2020)
Li, Y., Fu, C., Ding, F., Huang, Z., Lu, G.: AutoTrack: towards high-performance visual tracking for UAV with automatic spatio-temporal regularization. In: Proceedings of IEEE CVPR (2020)
Danelljan, M., Bhat, G., Khan, F.S., Felsberg, M.: ATOM: accurate tracking by overlap maximization. In: Proceedings of CVPR (2019)
Held, D., Thrun, S., Savarese, S.: Learning to track at 100 FPS with deep regression networks. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 749–765. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0_45
Bertinetto, L., Valmadre, J., Henriques, J.F., Vedaldi, A., Torr, P.H.S.: Fully-convolutional siamese networks for object tracking. In: Hua, G., Jégou, H. (eds.) ECCV 2016. LNCS, vol. 9914, pp. 850–865. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-48881-3_56
Valmadre, J., Bertinetto, L., Henriques, J., Vedaldi, A., Torr, P.H.: End-to-end representation learning for correlation filter based tracking. In: Proceedings of IEEE CVPR, pp. 5000–5008 (2017)
Dong, X., Shen, J.: Triplet loss in Siamese network for object tracking. In: Proceedings of ECCV, pp. 472–488 (2018)
Danelljan, M., Robinson, A., Khan, F.S., Felsberg, M.: Beyond correlation filters: learning continuous convolution operators for visual tracking. In: Proceedings of ECCV, pp. 472–488 (2016)
Danelljan, M., Bhat, G., Shahbaz Khan, F., Felsberg, M.: ECO: efficient convolution operators for tracking. In: Proceedings of IEEE CVPR, pp. 6931–6939 (2017)
Zhu, Z., Wang, Q., Li, B., Wu, W., Yan, J., Hu, W.: Distractor-aware Siamese networks for visual object tracking. In: Proceedings of ECCV, pp. 103–119 (2018)
Zhang, Z., Peng, H.: Deeper and wider Siamese networks for real-time visual tracking (2019)
Fan, H., Ling, H.: Siamese cascaded region proposal networks for real-time visual tracking (2018)
Wang, Q., Zhang, L., Bertinetto, L., Hu, W., Torr, P.H.: Fast online object tracking and segmentation: a unifying approach. In: Proceedings of IEEE CVPR (2019)
Li, B., Wu, W., Wang, Q., Zhang, F., Xing, J., Yan, J.: SiamRPN++: evolution of Siamese visual tracking with very deep networks. In: Proceedings of IEEE CVPR (2019)
Li, B., Yan, J., Wu, W., Zhu, Z., Hu, X.: High performance visual tracking with Siamese region proposal network. In: Proceedings of IEEE CVPR, pp. 8971–8980 (2018)
Bhat, G., Danelljan, M., Gool, L.V., Timofte, R.: Learning discriminative model prediction for tracking. In: Proceedings of IEEE ICCV (2019)
Liu, W., et al.: SSD: single shot MultiBox detector. In: Proceedings of ECCV, pp. 21–37 (2016)
Fu, C., Liu, W., Ranga, A., Tyagi, A., Berg, A.: DSSD: deconvolutional single shot detector (2017)
Cui, L., et al.: MDSSD: multi-scale deconvolutional single shot detector for small objects (2018)
Lim, J.S., Astrid, M., Yoon, H.J., Lee, S.I.: Small object detection using context and attention (2019)
Yang, X., et al.: SCRDet: towards more robust detection for small, cluttered and rotated objects. In: Proceedings IEEE ICCV (2019)
Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement (2018)
Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of IEEE CVPR, pp. 580–587 (2014)
Jiang, B., Luo, R., Mao, J., Xiao, T., Jiang, Y.: Acquisition of localization confidence for accurate object detection. In: Proceedings of ECCV, pp. 816–832 (2018)
Park, J., Woo, S., Lee, J.Y., Kweon, I.S.: BAM: bottleneck attention module. In: Proceedings of BMVC, pp. 147–161 (2018)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of IEEE CVPR, pp. 770–778 (2016)
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: Proceedings of CVPR, pp. 2818–2826 (2016)
Hu, J., Shen, L., Albanie, S., Sun, G., Wu, E.: Squeeze-and-excitation networks. IEEE Trans. Pattern Anal. Mach. Intell. 42(8), 2011–2023 (2020). https://doi.org/10.1109/TPAMI.2019.2913372
Fan, H., et al.: LaSOT: a high-quality benchmark for large-scale single object tracking. In: Proceedings of IEEE CVPR (2019)
Huang, L., Zhao, X., Huang, K.: GOT-10k: a large high-diversity benchmark for generic object tracking in the wild. IEEE Trans. Pattern Anal. Mach. Intell. 1 (2019). https://doi.org/10.1109/TPAMI.2019.2957464
He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: Proceedings of ICCV, pp. 1026–1034 (2015)
Galoogahi, H.K., Fagg, A., Huang, C., Ramanan, D., Lucey, S.: Need for speed: a benchmark for higher frame rate object tracking. In: Proceedings of IEEE ICCV, pp. 1134–1143 (2017)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: Proceedings of ICLR (2014)
Danelljan, M., Gool, L.V., Timofte, R.: Probabilistic regression for visual tracking. In: Proceedings of IEEE CVPR (2020)
Zhang, Z., Peng, H., Fu, J., Li, B., Hu, W.: Ocean: object-aware anchor-free tracking. In: Proceedings of ECCV (2020)
Song, Y., Ma, C., Gong, L., Zhang, J., Lau, R.W., Yang, M.H.: CREST: convolutional residual learning for visual tracking. In: Proceedings of ICCV, pp. 2574–2583 (2017)
Nam, H., Han, B.: Learning multi-domain convolutional neural networks for visual tracking. In: Proceedings of IEEE CVPR, pp. 4293–4302 (2016)
Fan, H., Ling, H.: Parallel tracking and verifying. IEEE Trans. Image Process. 28, 4130–4144 (2019)
Zhang, T., Xu, C., Yang, M.H.: Multi-task correlation particle filter for robust object tracking. In: Proceedings of IEEE CVPR, pp. 4819–4827 (2017)
Electronic supplementary material
Supplementary material 1 (mp4 93098 KB)
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Marvasti-Zadeh, S.M., Khaghani, J., Ghanei-Yakhdan, H., Kasaei, S., Cheng, L. (2021). COMET: Context-Aware IoU-Guided Network for Small Object Tracking. In: Ishikawa, H., Liu, CL., Pajdla, T., Shi, J. (eds) Computer Vision – ACCV 2020. ACCV 2020. Lecture Notes in Computer Science(), vol 12623. Springer, Cham. https://doi.org/10.1007/978-3-030-69532-3_36
Print ISBN: 978-3-030-69531-6
Online ISBN: 978-3-030-69532-3