Multi-object Tracking Algorithm Based on YOLOv3 and Hierarchical Data Association

Computer Science ›› 2021, Vol. 48 ›› Issue (11A): 370-375. doi: 10.11896/jsjkx.201000115

• Image Processing & Multimedia Technology •

LIU Yan1,2, QIN Pin-le1, ZENG Jian-chao1,2

  1 Shanxi Medical Imaging and Data Analysis Engineering Research Center, North University of China, Taiyuan 030051, China
  2 School of Electrical and Control Engineering, North University of China, Taiyuan 030051, China
  • Online: 2021-11-10 Published: 2021-11-12
  • Corresponding author: ZENG Jian-chao (zjc@nuc.edu.cn)
  • Author e-mail: ly950322@126.com
  • About author: LIU Yan, born in 1995, master's degree student, is a member of China Computer Federation. Her main research interests include multi-target tracking, digital image processing, and computer vision.
    ZENG Jian-chao, born in 1963, Ph.D., is a member of China Computer Federation. His main research interests include medical imaging and maintenance decision-making for complex systems.
  • Supported by:
    Shanxi Provincial Key Research and Development Plan (201803D31212-1).

Abstract: To improve the real-time performance of multi-object tracking and to ease the tracking difficulties caused by highly similar object appearance and a large number of false detections, a multi-object tracking method based on an improved YOLOv3 and hierarchical data association is proposed. Because the lightweight network MobileNet uses depthwise separable convolutions to compress the original network and reduce the number of parameters, MobileNet replaces the backbone of YOLOv3 while the multi-scale prediction part of YOLOv3 is retained, which reduces network complexity and lets the method meet real-time requirements. Compared with the detection networks used in other multi-object tracking methods, the proposed detection network has a model size of 91 M, and the detection time for a single image can reach 3.12 s. The algorithm also introduces a hierarchical data association method based on object appearance features and motion features. Compared with association that uses appearance features only, hierarchical data association improves the evaluation metric MOTA by 6.5% and MOTP by 1.7%. On the MOT16 dataset, the tracking accuracy reaches 77.2%, and the method shows good robustness to interference and good real-time performance.

Key words: Deep learning, Hierarchical data association, Lightweight network, Multi-object tracking, YOLOv3
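
The abstract attributes the smaller detector to MobileNet's depthwise separable convolutions. The arithmetic behind that reduction can be sketched as follows. The layer shape (3 x 3 kernel, 256 input channels, 512 output channels) is an illustrative assumption and is not taken from the paper, and bias terms are ignored.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a k x k depthwise convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer shape chosen only for illustration.
    k, c_in, c_out = 3, 256, 512
    std = standard_conv_params(k, c_in, c_out)
    sep = depthwise_separable_params(k, c_in, c_out)
    print(f"standard: {std}, separable: {sep}, ratio: {sep / std:.4f}")
```

For this layer the separable form needs 133,376 weights instead of 1,179,648, a ratio of 1/c_out + 1/k², which is roughly 0.113.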

CLC Number:

  • TP391.41
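
The abstract states only that data association is hierarchical and uses appearance and motion features; it does not give the stage order, the distance measures, or the thresholds. The sketch below is therefore an assumption-laden illustration, not the authors' method: stage 1 matches by cosine distance between appearance features, stage 2 matches the leftovers by IoU between a track's predicted box and the detection box. The dictionary keys (`feature`, `predicted_box`, `box`), the thresholds, and the greedy matcher (used here in place of an optimal assignment solver) are all hypothetical choices.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; returns 1.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        return 1.0
    return 1.0 - dot / (na * nb)


def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    ih = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = iw * ih
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_match(cost, rows, cols, max_cost):
    """Pair rows with cols in order of increasing cost; reject pairs above max_cost."""
    candidates = sorted((cost[r][c], r, c) for r in rows for c in cols)
    used_r, used_c, matches = set(), set(), []
    for value, r, c in candidates:
        if value > max_cost:
            break
        if r in used_r or c in used_c:
            continue
        matches.append((r, c))
        used_r.add(r)
        used_c.add(c)
    return matches


def hierarchical_associate(tracks, detections, max_appearance_dist=0.3, min_iou=0.3):
    """Return sorted (track_index, detection_index) pairs from two matching stages."""
    t_idx = list(range(len(tracks)))
    d_idx = list(range(len(detections)))

    # Stage 1 (assumed): appearance matching with cosine distance.
    app_cost = [[cosine_distance(t["feature"], d["feature"]) for d in detections]
                for t in tracks]
    matches = greedy_match(app_cost, t_idx, d_idx, max_appearance_dist)

    # Stage 2 (assumed): motion matching for whatever stage 1 left unmatched.
    matched_t = {t for t, _ in matches}
    matched_d = {d for _, d in matches}
    rest_t = [t for t in t_idx if t not in matched_t]
    rest_d = [d for d in d_idx if d not in matched_d]
    iou_cost = [[1.0 - iou(t["predicted_box"], d["box"]) for d in detections]
                for t in tracks]
    matches += greedy_match(iou_cost, rest_t, rest_d, 1.0 - min_iou)
    return sorted(matches)


if __name__ == "__main__":
    tracks = [
        {"feature": [1.0, 0.0], "predicted_box": (0, 0, 10, 10)},
        {"feature": [0.0, 1.0], "predicted_box": (20, 20, 30, 30)},
    ]
    detections = [
        {"feature": [0.9, 0.1], "box": (1, 1, 11, 11)},
        # Looks more like track 0 than track 1, but lies where track 1 is predicted.
        {"feature": [1.0, 0.5], "box": (21, 21, 31, 31)},
    ]
    print(hierarchical_associate(tracks, detections))
```

In this toy input the second detection resembles track 0 in appearance, so appearance alone cannot assign it to track 1; the motion stage recovers that pairing, which is the kind of case a layered scheme is meant to handle.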