Cross-attention Guided Siamese Network Object Tracking Algorithm

Computer Science, 2022, Vol. 49, Issue (3): 163-169. doi: 10.11896/jsjkx.210300066

• Computer Graphics & Multimedia •

ZHAO Yue, YU Zhi-bin, LI Yong-chun   

  1. College of Electronic Engineering, Southwest Jiaotong University, Chengdu 611756, China
  • Received: 2021-03-08 Revised: 2021-04-13 Online: 2022-03-15 Published: 2022-03-15
  • About author: ZHAO Yue, born in 1995, postgraduate. His main research interests include artificial intelligence, pattern recognition and computer vision.
    YU Zhi-bin, born in 1977, Ph.D, associate professor, is a member of China Computer Federation. His main research interests include artificial intelligence, pattern recognition and signal processing.
  • Supported by:
    National Defense Pre-Research Foundation of China (61403120304).

Abstract: Most traditional Siamese trackers are not robust when facing similar objects, deformation, background clutter and other challenges. To address this problem, this paper proposes a cross-attention guided Siamese network (called SiamCAN). Firstly, different layers of ResNet50 are used to obtain object features at various resolutions, and a cross-attention module is designed to bridge the information flow between the search branch and the template branch. After that, the features from the different backbone layers are fed into CNNs to update parameters and are combined with each other in the classification network and the regression network. Finally, the predicted location and target size are calculated from the maximum response on the response map. Experimental results on the UAV123 tracking dataset show that, compared with the mainstream algorithm SiamBAN, the tracking precision is improved by 1.7% and the tracking accuracy is improved by 0.7%. Moreover, on the VOT2018 benchmark, the expected average overlap (EAO) of our method exceeds that of the mainstream algorithm SiamRPN++ by 2.5, while the tracking speed remains at 35 FPS.
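
The abstract names two steps without giving their details: a cross-attention module linking the search and template branches, and localization from the maximum of the response map. The following is only a minimal sketch of what such steps could look like, not the SiamCAN implementation. It assumes plain scaled dot-product attention with search features as queries and template features as keys and values; the function names `cross_attention` and `peak_location` and the toy inputs are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Scaled dot-product cross-attention (assumed form, not from the paper).

    queries:     feature vectors from one branch (here: search), length d each.
    keys_values: feature vectors from the other branch (here: template),
                 used as both keys and values.
    Returns one attended vector per query.
    """
    d = len(queries[0])
    attended = []
    for q in queries:
        # Similarity of this query to every vector of the other branch.
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys_values
        ]
        weights = softmax(scores)
        # Weighted sum of the other branch's vectors.
        attended.append(
            [sum(w * v[i] for w, v in zip(weights, keys_values)) for i in range(d)]
        )
    return attended


def peak_location(response_map):
    """Return (row, col) of the maximum value in a 2-D response map."""
    positions = (
        (r, c) for r, row in enumerate(response_map) for c in range(len(row))
    )
    return max(positions, key=lambda rc: response_map[rc[0]][rc[1]])


# Toy inputs (hypothetical): two 2-D feature vectors per branch.
template = [[1.0, 0.0], [0.0, 1.0]]
search = [[10.0, 0.0], [0.0, 10.0]]
fused = cross_attention(search, template)

# Toy response map; its peak is at row 1, column 0.
row, col = peak_location([[0.1, 0.2], [0.9, 0.3]])
```

In a real tracker the peak index would still have to be mapped back to image coordinates, and the target size would come from the regression branch; both depend on details the abstract does not state.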

Key words: Anchor-free regression, Cross-attention module, Siamese network, Similar object distractor, Visual object tracking

CLC Number: 

  • TP391