{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,3,22]],"date-time":"2025-03-22T12:31:15Z","timestamp":1742646675483,"version":"3.37.3"},"reference-count":143,"publisher":"MDPI AG","issue":"6","license":[{"start":{"date-parts":[[2024,5,28]],"date-time":"2024-05-28T00:00:00Z","timestamp":1716854400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Computers"],"abstract":"Object tracking is one of the most important problems in computer vision applications such as robotics, autonomous driving, and pedestrian movement. There has been a significant development in camera hardware where researchers are experimenting with the fusion of different sensors and developing image processing algorithms to track objects. Image processing and deep learning methods have significantly progressed in the last few decades. Different data association methods accompanied by image processing and deep learning are becoming crucial in object tracking tasks. The data requirement for deep learning methods has led to different public datasets that allow researchers to benchmark their methods. While there has been an improvement in object tracking methods, technology, and the availability of annotated object tracking datasets, there is still scope for improvement. This review contributes by systemically identifying different sensor equipment, datasets, methods, and applications, providing a taxonomy about the literature and the strengths and limitations of different approaches, thereby providing guidelines for selecting equipment, methods, and applications. 
Research questions and future scope to address the unresolved issues in the object tracking field are also presented with research direction guidelines.","DOI":"10.3390\/computers13060136","type":"journal-article","created":{"date-parts":[[2024,5,28]],"date-time":"2024-05-28T17:32:55Z","timestamp":1716917575000},"page":"136","source":"Crossref","is-referenced-by-count":6,"title":["Object Tracking Using Computer Vision: A Review"],"prefix":"10.3390","volume":"13","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9696-6189","authenticated-orcid":false,"given":"Pushkar","family":"Kadam","sequence":"first","affiliation":[{"name":"School of Engineering, Design and Built Environment, Western Sydney University, Locked Bag 1797, Penrith, NSW 2751, Australia"}]},{"given":"Gu","family":"Fang","sequence":"additional","affiliation":[{"name":"School of Engineering, Design and Built Environment, Western Sydney University, Locked Bag 1797, Penrith, NSW 2751, Australia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5091-7309","authenticated-orcid":false,"given":"Ju Jia","family":"Zou","sequence":"additional","affiliation":[{"name":"School of Engineering, Design and Built Environment, Western Sydney University, Locked Bag 1797, Penrith, NSW 2751, Australia"}]}],"member":"1968","published-online":{"date-parts":[[2024,5,28]]},"reference":[{"key":"ref_1","first-page":"205105","article-title":"DyStSLAM: An efficient stereo vision SLAM system in dynamic environment","volume":"34","author":"Li","year":"2023","journal-title":"Meas. Sci. Technol."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"106007","DOI":"10.1016\/j.compag.2021.106007","article-title":"Dynamic tree branch tracking for aerial canopy sampling using stereo vision","volume":"182","author":"Busch","year":"2021","journal-title":"Comput. Electron. 
Agric."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1401","DOI":"10.1109\/TRO.2021.3061364","article-title":"Spatiotemporal Multisensor Calibration via Gaussian Processes Moving Target Tracking","volume":"37","author":"Persic","year":"2021","journal-title":"IEEE Trans. Robot."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"2818","DOI":"10.1109\/TAES.2013.6621857","article-title":"6 Degree-of-Freedom Motion Estimation of a Moving Target using Monocular Image Sequences","volume":"49","author":"Kwon","year":"2013","journal-title":"IEEE Trans. Aerosp. Electron. Syst."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"3291011","DOI":"10.1109\/TIM.2023.3291011","article-title":"VIMOT: A Tightly Coupled Estimator for Stereo Visual-Inertial Navigation and Multiobject Tracking","volume":"72","author":"Feng","year":"2023","journal-title":"IEEE Trans. Instrum. Meas."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"925","DOI":"10.1049\/cvi2.12206","article-title":"SA-FlowNet: Event-based self-attention optical flow estimation with spiking-analogue neural networks","volume":"17","author":"Yang","year":"2023","journal-title":"IET Comput. Vision"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Shen, Y., Liu, Y., Tian, Y., Liu, Z., and Wang, F. (2022). A New Parallel Intelligence Based Light Field Dataset for Depth Refinement and Scene Flow Estimation. Sensors, 22.","DOI":"10.3390\/s22239483"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"11714","DOI":"10.1109\/JSEN.2019.2937304","article-title":"A Combined Vision-Based Multiple Object Tracking and Visual Odometry System","volume":"19","author":"Aladem","year":"2019","journal-title":"IEEE Sens. 
J."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1354316","DOI":"10.1155\/2018\/1354316","article-title":"Illumination invariant motion detection and tracking using SMDWT and a dense disparity-variance method","volume":"2018","author":"Deepambika","year":"2018","journal-title":"J. Sens."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"338","DOI":"10.1016\/j.robot.2016.05.001","article-title":"Radar and stereo vision fusion for multitarget tracking on the special Euclidean group","volume":"83","year":"2016","journal-title":"Robot. Auton. Syst."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"167","DOI":"10.1109\/TCSVT.2014.2357093","article-title":"Tracking live fish from low-contrast and low-frame-rate stereo videos","volume":"25","author":"Chuang","year":"2015","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"2002","DOI":"10.1109\/TBME.2022.3233909","article-title":"Soft Tissue Monitoring of the Surgical Field: Detection and Tracking of Breast Surface Deformations","volume":"70","author":"Richey","year":"2023","journal-title":"IEEE Trans. Biomed. Eng."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Gionfrida, L., Rusli, W., Bharath, A., and Kedgley, A. (2022). Validation of two-dimensional video-based inference of finger kinematics with pose estimation. PLoS ONE, 17.","DOI":"10.1101\/2022.06.22.497125"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"93","DOI":"10.1016\/j.compmedimag.2017.07.001","article-title":"Biopsy needle tracking technique in US images","volume":"65","author":"Czajkowska","year":"2018","journal-title":"Comput. Med. 
Imaging Graph."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1016\/j.neucom.2016.01.122","article-title":"3D character recognition using binocular camera for medical assist","volume":"220","author":"Yang","year":"2017","journal-title":"Neurocomputing"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"847","DOI":"10.1109\/TPAMI.2014.2353638","article-title":"Stereo reconstruction of droplet flight trajectories","volume":"37","author":"Zarrabeitia","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_17","first-page":"1","article-title":"A survey of appearance models in visual object tracking","volume":"4","author":"Li","year":"2013","journal-title":"ACM Trans. Intell. Syst. Technol."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1145\/3065386","article-title":"ImageNet Classification with Deep Convolutional Neural Networks","volume":"60","author":"Krizhevsky","year":"2017","journal-title":"Commun. ACM"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"113711","DOI":"10.1016\/j.eswa.2020.113711","article-title":"Recent trends in multicue based visual tracking: A review","volume":"162","author":"Kumar","year":"2020","journal-title":"Expert Syst. Appl."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Park, Y., Dang, L.M., Lee, S., Han, D., and Moon, H. (2021). Multiple object tracking in deep learning approaches: A survey. 
Electronics, 10.","DOI":"10.3390\/electronics10192406"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"32650","DOI":"10.1109\/ACCESS.2021.3060821","article-title":"Analysis Based on Recent Deep Learning Approaches Applied in Real-Time Multi-Object Tracking: A Review","volume":"9","author":"Kalake","year":"2021","journal-title":"IEEE Access"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"6101","DOI":"10.1109\/TITS.2021.3077883","article-title":"An Empirical Review of Deep Learning Frameworks for Change Detection: Model Design, Experimental Frameworks, Challenges and Research Needs","volume":"23","author":"Mandal","year":"2022","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Guo, S., Wang, S., Yang, Z., Wang, L., Zhang, H., Guo, P., Gao, Y., and Guo, J. (2022). A Review of Deep Learning-Based Visual Multi-Object Tracking Algorithms for Autonomous Driving. Appl. Sci., 12.","DOI":"10.3390\/app122110741"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"102317","DOI":"10.1016\/j.displa.2022.102317","article-title":"A survey of detection-based video multi-object tracking","volume":"75","author":"Dai","year":"2022","journal-title":"Displays"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"116300","DOI":"10.1016\/j.eswa.2021.116300","article-title":"Data association in multiple object tracking: A survey of recent techniques","volume":"192","author":"Rakai","year":"2022","journal-title":"Expert Syst. Appl."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"512","DOI":"10.1007\/s11633-022-1344-1","article-title":"Long-term Visual Tracking: Review and Experimental Comparison","volume":"19","author":"Liu","year":"2022","journal-title":"Mach. Intell. Res."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Rocha, R.d.L., and de Figueiredo, F.A.P. (2023). 
Beyond Land: A Review of Benchmarking Datasets, Algorithms, and Metrics for Visual-Based Ship Tracking. Electronics, 12.","DOI":"10.3390\/electronics12132789"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"31869","DOI":"10.3390\/s151229892","article-title":"Quantitative evaluation of stereo visual odometry for autonomous vessel localisation in inland waterway sensing applications","volume":"15","author":"Kriechbaumer","year":"2015","journal-title":"Sensors"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"197","DOI":"10.1016\/j.oceaneng.2017.01.024","article-title":"Stereovision-based target tracking system for USV operations","volume":"133","author":"Sinisterra","year":"2017","journal-title":"Ocean Eng."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Gennaro, T.D., and Waldmann, J. (2023). Sensor Fusion with Asynchronous Decentralized Processing for 3D Target Tracking with a Wireless Camera Network. Sensors, 23.","DOI":"10.3390\/s23031194"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Hartley, R., and Zisserman, A. (2004). Multiple View Geometry in Computer Vision, Cambridge University Press. [2nd ed.].","DOI":"10.1017\/CBO9780511811685"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"1369","DOI":"10.1007\/s11760-021-01867-9","article-title":"High-speed moving target tracking of multi-camera system with overlapped field of view","volume":"15","author":"Yan","year":"2021","journal-title":"Signal Image Video Process"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"50","DOI":"10.1177\/0142331220921318","article-title":"An improved method for swing measurement based on monocular vision to the payload of overhead crane","volume":"44","author":"Huang","year":"2022","journal-title":"Trans. Inst. Meas. 
Control"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1109\/MMUL.2012.24","article-title":"Microsoft Kinect Sensor and Its Effect","volume":"19","author":"Zhang","year":"2012","journal-title":"IEEE MultiMedia"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"1231","DOI":"10.1177\/0278364913491297","article-title":"Vision meets robotics: The KITTI dataset","volume":"32","author":"Geiger","year":"2013","journal-title":"Int. J. Robot. Res."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"606","DOI":"10.1109\/TSMCA.2012.2220540","article-title":"Tracking people motion based on extended condensation algorithm","volume":"43","author":"Gardel","year":"2013","journal-title":"IEEE Trans. Syst. Man Cybern. Part A Syst. Hum."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"1339","DOI":"10.1016\/j.sigpro.2017.04.008","article-title":"Robust object tracking via multi-cue fusion","volume":"139","author":"Hu","year":"2017","journal-title":"Signal Process"},{"key":"ref_38","unstructured":"Bouguet, J.Y. (2024, February 27). Camera Calibration Toolbox for Matlab. Available online: https:\/\/data.caltech.edu\/records\/jx9cx-fdh55."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"62043","DOI":"10.1109\/ACCESS.2021.3074413","article-title":"Vision-Based Target Detection and Tracking System for a Quadcopter","volume":"9","author":"Wu","year":"2021","journal-title":"IEEE Access"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Rasoulidanesh, M., Yadav, S., Herath, S., Vaghei, Y., and Payandeh, S. (2019). Deep attention models for human tracking using RGBD. Sensors, 19.","DOI":"10.3390\/s19040750"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Song, S., and Xiao, J. (2013, January 1\u20138). Tracking Revisited using RGBD Camera: Unified Benchmark and Baselines. 
Proceedings of the IEEE International Conference on Computer Vision, Sydney, Australia.","DOI":"10.1109\/ICCV.2013.36"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"1226","DOI":"10.1109\/TASE.2022.3176294","article-title":"Detection, Localization, and Tracking of Multiple MAVs with Panoramic Stereo Camera Networks","volume":"20","author":"Zheng","year":"2023","journal-title":"IEEE Trans. Autom. Sci. Eng."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"431","DOI":"10.1109\/JSTSP.2022.3211198","article-title":"Fusion of Inverse Synthetic Aperture Radar and Camera Images for Automotive Target Tracking","volume":"17","author":"Ram","year":"2023","journal-title":"IEEE J. Sel. Top. Signal Process"},{"key":"ref_44","first-page":"35","article-title":"A New framework of moving object tracking based on object detection-tracking with removal of moving features","volume":"11","author":"Ngoc","year":"2020","journal-title":"Int. J. Adv. Comput. Sci. Appl."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1007\/s11263-009-0273-6","article-title":"HUMANEVA: Synchronized Video and Motion Capture Dataset and Baseline Algorithm for Evaluation of Articulated Human Motion","volume":"87","author":"Sigal","year":"2010","journal-title":"Int. J. Comput. Vis."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Mdfaa, M.A., Kulathunga, G., and Klimchik, A. (2022). 3D-SiamMask: Vision-Based Multi-Rotor Aerial-Vehicle Tracking for a Moving Object. Remote Sens., 14.","DOI":"10.3390\/rs14225756"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"11568","DOI":"10.1109\/TITS.2023.3292278","article-title":"Vehicle Detection for Autonomous Driving: A Review of Algorithms and Datasets","volume":"24","author":"Karangwa","year":"2023","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Flohr, F., and Gavrila, D. (2013, January 9\u201313). 
PedCut: An iterative framework for pedestrian segmentation combining shape models and multiple data cues. Proceedings of the British Machine Vision Conference (BMVC), Bristol, UK.","DOI":"10.5244\/C.27.66"},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"2800793","DOI":"10.1109\/LRA.2018.2800793","article-title":"The Multi Vehicle Stereo Event Camera Dataset: An Event Camera Dataset for 3D Perception","volume":"3","author":"Zhu","year":"2018","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Nikolic, J., Rehder, J., Burri, M., Gohl, P., Leutenegger, S., Furgale, P.T., and Siegwart, R. (June, January 31). A synchronized visual-inertial sensor system with FPGA pre-processing for accurate real-time SLAM. Proceedings of the 2014 IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, China.","DOI":"10.1109\/ICRA.2014.6906892"},{"key":"ref_51","first-page":"19","article-title":"A dataset and evaluation methodology for depth estimation on 4D light fields","volume":"Volume 10113","author":"Honauer","year":"2017","journal-title":"Computer Vision\u2013ACCV 2016, Proceedings of the 13th Asian Conference on Computer Vision, Taipei, Taiwan, 20\u201324 November 2016"},{"key":"ref_52","first-page":"431","article-title":"The Tenth Visual Object Tracking VOT2022 Challenge Results","volume":"Volume 13808","author":"Kristan","year":"2023","journal-title":"Proceedings of the European Conference on Computer Vision"},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"1834","DOI":"10.1109\/TPAMI.2014.2388226","article-title":"Object tracking benchmark","volume":"37","author":"Wu","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Pauwels, K., Rubio, L., D\u00edaz, J., and Ros, E. (2013, January 23\u201328). Real-time Model-based Rigid Object Pose Estimation and Tracking Combining Dense and Sparse Visual Cues. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA.","DOI":"10.1109\/CVPR.2013.304"},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"927","DOI":"10.1177\/0278364912445831","article-title":"The KIT object models database: An object model database for object recognition, localization and manipulation in service robotics","volume":"31","author":"Kasper","year":"2012","journal-title":"Int. J. Robot. Res."},{"key":"ref_56","doi-asserted-by":"crossref","first-page":"5159","DOI":"10.1109\/LRA.2020.3003866","article-title":"Seeing through the Occluders: Robust Monocular 6-DOF Object Pose Tracking via Model-Guided Video Object Segmentation","volume":"5","author":"Zhong","year":"2020","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Krull, A., Michel, F., Brachmann, E., Gumhold, S., Ihrke, S., and Rother, C. (2014, January 1\u20135). 6-DOF Model Based Tracking via Object Coordinate Regression. Proceedings of the Computer Vision\u2014ACCV, Singapore.","DOI":"10.1007\/978-3-319-16817-3_25"},{"key":"ref_58","doi-asserted-by":"crossref","first-page":"104141","DOI":"10.1016\/j.autcon.2022.104141","article-title":"Development of training image database using web crawling for vision-based site monitoring","volume":"135","author":"Hwang","year":"2022","journal-title":"Autom. Constr."},{"key":"ref_59","doi-asserted-by":"crossref","unstructured":"Krause, J., Stark, M., Deng, J., and Li, F.-F. (2013, January 2\u20138). 3D Object Representations for Fine-Grained Categorization. Proceedings of the IEEE International Conference on Computer Vision Workshops, Sydney, Australia.","DOI":"10.1109\/ICCVW.2013.77"},{"key":"ref_60","doi-asserted-by":"crossref","unstructured":"Cimpoi, M., Maji, S., Kokkinos, I., Mohamed, S., and Vedaldi, A. (2014, January 23\u201328). Describing Textures in the Wild. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.461"},{"key":"ref_61","unstructured":"Zauner, C. (2024, February 27). Implementation and Benchmarking of Perceptual Image Hash Functions. Available online: http:\/\/www.phash.org\/docs\/pubs\/thesis_zauner.pdf."},{"key":"ref_62","first-page":"79","article-title":"The Visual Object Tracking VOT2015 challenge results 2015 IEEE International Conference on Computer Vision Workshop","volume":"32","author":"Kristan","year":"2015","journal-title":"Chin. Acad. Sci."},{"key":"ref_63","doi-asserted-by":"crossref","first-page":"777","DOI":"10.1007\/978-3-319-48881-3_54","article-title":"The visual object tracking VOT2016 challenge results","volume":"Volume 9914","author":"Kristan","year":"2016","journal-title":"Computer Vision\u2013ECCV 2016 Workshops: Amsterdam, The Netherlands, October 8\u201310 and 15\u201316, 2016, Proceedings, Part II"},{"key":"ref_64","unstructured":"Kristan, M., Leonardis, A., Matas, J., Felsberg, M., Pflugfelder, R., \u010cehovin Zajc, L., Voj\u00edr, T., Bhat, G., Luke\u017ei\u010d, A., and Eldesokey, A. (2018, January 8\u201314). The sixth visual object tracking VOT2018 challenge results. Proceedings of the Computer Vision\u2014ECCV 2018 Workshops, Munich, Germany. Lecture Notes in Computer Science."},{"key":"ref_65","unstructured":"Kristan, M., Matas, J., Leonardis, A., Felsberg, M., Pflugfelder, R., K\u00e4m\u00e4r\u00e4inen, J.K., Zajc, L.C., Drbohlav, O., Lukezic, A., and Berg, A. (2019, January 27\u201328). The seventh visual object tracking VOT2019 challenge results. Proceedings of the 2019 International Conference on Computer Vision Workshop, ICCVW 2019, Seoul, Republic of Korea."},{"key":"ref_66","unstructured":"Dendorfer, P., Rezatofighi, H., Milan, A., Shi, J., Cremers, D., Reid, I., Roth, S., Schindler, K., and Leal-Taix\u00e9, L. (2020). 
MOT20: A Benchmark for Multi Object Tracking in Crowded Scenes. arXiv."},{"key":"ref_67","unstructured":"Leal-Taix\u00e9, L., Milan, A., Reid, I., Roth, S., and Schindler, K. (2015). MOTChallenge 2015: Towards a Benchmark for Multi-Target Tracking. arXiv."},{"key":"ref_68","unstructured":"Milan, A., Leal-Taix\u00e9, L., Reid, I., Roth, S., and Schindler, K. (2016). MOT16: A Benchmark for Multi-Object Tracking. arXiv."},{"key":"ref_69","unstructured":"Dendorfer, P., Rezatofighi, H., Milan, A., Shi, J., Cremers, D., Reid, I., Roth, S., Schindler, K., and Leal-Taix\u00e9, L. (2019). CVPR19 Tracking and Detection Challenge: How crowded can it get? arXiv."},{"key":"ref_70","doi-asserted-by":"crossref","first-page":"103448","DOI":"10.1016\/j.artint.2020.103448","article-title":"Multiple object tracking: A literature review","volume":"293","author":"Luo","year":"2021","journal-title":"Artif. Intell."},{"key":"ref_71","doi-asserted-by":"crossref","unstructured":"Dollar, P., Wojek, C., Schiele, B., and Perona, P. (2009, January 20\u201325). Pedestrian detection: A benchmark. Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA.","DOI":"10.1109\/CVPR.2009.5206631"},{"key":"ref_72","doi-asserted-by":"crossref","first-page":"127","DOI":"10.1007\/s00521-016-2319-3","article-title":"Online adaptive multiple pedestrian tracking in monocular surveillance video","volume":"28","author":"Wang","year":"2017","journal-title":"Neural Comput. Appl."},{"key":"ref_73","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1016\/j.patrec.2014.01.005","article-title":"Performance evaluation of crowd image analysis using the PETS2009 dataset","volume":"44","author":"Ferryman","year":"2014","journal-title":"Pattern Recognit. 
Lett."},{"key":"ref_74","doi-asserted-by":"crossref","first-page":"1797","DOI":"10.1109\/TPAMI.2018.2884990","article-title":"A Region-Based Gauss-Newton Approach to Real-Time Monocular Multiple Object Tracking","volume":"41","author":"Tjaden","year":"2019","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_75","doi-asserted-by":"crossref","first-page":"2225","DOI":"10.1109\/TAES.2021.3061807","article-title":"Real-Time Navigation for Drogue-Type Autonomous Aerial Refueling Using Vision-Based Deep Learning Detection","volume":"57","author":"Garcia","year":"2021","journal-title":"IEEE Trans. Aerosp. Electron. Syst."},{"key":"ref_76","doi-asserted-by":"crossref","first-page":"332","DOI":"10.1016\/j.actaastro.2018.01.029","article-title":"Fault-tolerant feature-based estimation of space debris rotational motion during active removal missions","volume":"146","author":"Biondi","year":"2018","journal-title":"Acta Astronaut."},{"key":"ref_77","doi-asserted-by":"crossref","first-page":"8163","DOI":"10.1109\/TIE.2022.3217598","article-title":"Robust and Accurate Monocular Pose Tracking for Large Pose Shift","volume":"70","author":"Wang","year":"2023","journal-title":"IEEE Trans. Ind. Electron."},{"key":"ref_78","first-page":"7437289","article-title":"Real-Time 3D Pedestrian Tracking with Monocular Camera","volume":"2022","author":"Xiao","year":"2022","journal-title":"Wirel. Commun. Mob. Comput."},{"key":"ref_79","doi-asserted-by":"crossref","first-page":"913","DOI":"10.1007\/s11554-020-01054-y","article-title":"SmartSORT: An MLP-based method for tracking multiple objects in real-time","volume":"18","author":"Meneses","year":"2021","journal-title":"J. 
Real-Time Image Process."},{"key":"ref_80","doi-asserted-by":"crossref","first-page":"7892","DOI":"10.1109\/JIOT.2020.2996609","article-title":"Multiplex Labeling Graph for Near-Online Tracking in Crowded Scenes","volume":"7","author":"Zhang","year":"2020","journal-title":"IEEE Internet Things J."},{"key":"ref_81","doi-asserted-by":"crossref","first-page":"3852","DOI":"10.1109\/TIP.2013.2263146","article-title":"Monocular human motion tracking by using de-mc particle filter","volume":"22","author":"Du","year":"2013","journal-title":"IEEE Trans. Image Process."},{"key":"ref_82","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You Only Look Once: Unified, Real-Time Object Detection. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_83","unstructured":"Redmon, J., and Farhadi, A. (2018). YOLOv3: An Incremental Improvement. arXiv."},{"key":"ref_84","doi-asserted-by":"crossref","unstructured":"Girshick, R. (2015, January 7\u201313). Fast R-CNN. Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile.","DOI":"10.1109\/ICCV.2015.169"},{"key":"ref_85","first-page":"63","article-title":"Erosion and Dilation","volume":"2","author":"Soille","year":"2004","journal-title":"Morphol. Image Anal."},{"key":"ref_86","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1007\/s11263-020-01359-2","article-title":"Image Matching from Handcrafted to Deep Features: A Survey","volume":"129","author":"Ma","year":"2021","journal-title":"Int. J. Comput. Vis."},{"key":"ref_87","doi-asserted-by":"crossref","unstructured":"Geiger, A., Ziegler, J., and Stiller, C. (2011, January 5\u20139). StereoScan: Dense 3d reconstruction in real-time. 
Proceedings of the 2011 IEEE Intelligent Vehicles Symposium (IV), Baden, Germany.","DOI":"10.1109\/IVS.2011.5940405"},{"key":"ref_88","doi-asserted-by":"crossref","first-page":"35","DOI":"10.1115\/1.3662552","article-title":"A new approach to linear filtering and prediction problems","volume":"82","author":"Kalman","year":"1960","journal-title":"J. Fluids Eng. Trans. ASME"},{"key":"ref_89","doi-asserted-by":"crossref","unstructured":"Steinbr\u00fccker, F., Sturm, J., and Cremers, D. (2011, January 6\u201313). Real-time visual odometry from dense RGB-D images. Proceedings of the IEEE International Conference on Computer Vision, Barcelona, Spain.","DOI":"10.1109\/ICCVW.2011.6130321"},{"key":"ref_90","doi-asserted-by":"crossref","first-page":"82","DOI":"10.1016\/j.patrec.2015.10.014","article-title":"Extended fast compressive tracking with weighted multi-frame template matching for fast motion tracking","volume":"69","author":"Jenkins","year":"2016","journal-title":"Pattern Recognit. Lett."},{"key":"ref_91","unstructured":"Itseez (2024, February 27). Open Source Computer Vision Library. Available online: https:\/\/github.com\/itseez\/opencv."},{"key":"ref_92","doi-asserted-by":"crossref","first-page":"62","DOI":"10.1109\/TSMC.1979.4310076","article-title":"A Threshold Selection Method from Gray-Level Histograms","volume":"9","author":"Otsu","year":"1979","journal-title":"IEEE Trans. Syst. Man Cybern"},{"key":"ref_93","doi-asserted-by":"crossref","first-page":"679","DOI":"10.1109\/TPAMI.1986.4767851","article-title":"A Computational Approach to Edge Detection","volume":"PAMI-8","author":"Canny","year":"1986","journal-title":"IEEE Trans. Pattern Anal. Mach. 
Intell."},{"key":"ref_94","doi-asserted-by":"crossref","first-page":"1202","DOI":"10.1109\/TCSVT.2009.2020259","article-title":"Improved Low-Complexity Algorithm for 2-D Integer Lifting-Based Discrete Wavelet Transform Using Symmetric Mask-Based Scheme","volume":"19","author":"Hsia","year":"2009","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_95","unstructured":"Kanade, T., Kano, H., Kimura, S., Yoshida, A., and Oda, K. (1995, January 5\u20139). Development of a video-rate stereo machine. Proceedings of the 1995 IEEE\/RSJ International Conference on Intelligent Robots and Systems. Human Robot Interaction and Cooperative Robots, Pittsburgh, PA, USA."},{"key":"ref_96","first-page":"36","article-title":"White matter segmentation from MR images in subjects with brain tumours","volume":"Volume 7339 LNBI","author":"Szwarc","year":"2012","journal-title":"Information Technologies in Biomedicine, Proceedings of the Third International Conference, ITIB 2012, Gliwice, Poland, 11\u201313 June 2012"},{"key":"ref_97","unstructured":"Dalal, N., and Triggs, B. (2005, January 20\u201325). Histograms of oriented gradients for human detection. Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR\u201905), San Diego, CA, USA."},{"key":"ref_98","doi-asserted-by":"crossref","unstructured":"Alcantarilla, P.F., Bartoli, A., and Davison, A.J. (2012, January 7\u201313). KAZE features. Proceedings of the Computer Vision\u2013ECCV 2012: 12th European Conference on Computer Vision, Florence, Italy. Part VI 12.","DOI":"10.1007\/978-3-642-33783-3_16"},{"key":"ref_99","doi-asserted-by":"crossref","unstructured":"Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014, January 23\u201328). Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. 
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.81"},{"key":"ref_100","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Doll\u00e1r, P., and Girshick, R. (2017, January 22\u201329). Mask R-CNN. Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_101","doi-asserted-by":"crossref","first-page":"1137","DOI":"10.1109\/TPAMI.2016.2577031","article-title":"Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks","volume":"39","author":"Ren","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_102","doi-asserted-by":"crossref","first-page":"376","DOI":"10.1109\/70.88147","article-title":"Mobile robot localization by tracking geometric beacons","volume":"7","author":"Leonard","year":"1991","journal-title":"IEEE Trans. Robot. Autom."},{"key":"ref_103","first-page":"21","article-title":"SSD: Single shot multibox detector","volume":"Volume 9905","author":"Liu","year":"2016","journal-title":"Computer Vision\u2013ECCV 2016, Proceedings of the 14th European Conference, Amsterdam, The Netherlands, 11\u201314 October 2016"},{"key":"ref_104","doi-asserted-by":"crossref","first-page":"1680","DOI":"10.3390\/make5040083","article-title":"A Comprehensive Review of YOLO Architectures in Computer Vision: From YOLOv1 to YOLOv8 and YOLO-NAS","volume":"5","author":"Terven","year":"2023","journal-title":"Mach. Learn. Knowl. Extr."},{"key":"ref_105","unstructured":"Jocher, G. (2023, October 01). YOLOv5 by Ultralytics. Available online: https:\/\/doi.org\/10.5281\/zenodo.3908559."},{"key":"ref_106","doi-asserted-by":"crossref","unstructured":"Shafiee, M.J., Chywl, B., Li, F., and Wong, A. (2017). Fast YOLO: A Fast You Only Look Once System for Real-time Embedded Object Detection in Video. 
arXiv.","DOI":"10.15353\/vsnl.v3i1.171"},{"key":"ref_107","doi-asserted-by":"crossref","unstructured":"Li, Z., and Snavely, N. (2018, January 18\u201323). MegaDepth: Learning Single-View Depth Prediction from Internet Photos. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00218"},{"key":"ref_108","doi-asserted-by":"crossref","unstructured":"Wang, Q., Zhang, L., Bertinetto, L., Hu, W., and Torr, P.H. (2019, January 15\u201320). Fast Online Object Tracking and Segmentation: A Unifying Approach. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00142"},{"key":"ref_109","doi-asserted-by":"crossref","first-page":"1623","DOI":"10.1109\/TPAMI.2020.3019967","article-title":"Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-Shot Cross-Dataset Transfer","volume":"44","author":"Ranftl","year":"2022","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_110","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep Residual Learning for Image Recognition. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_111","doi-asserted-by":"crossref","first-page":"172","DOI":"10.1109\/TPAMI.2019.2929257","article-title":"OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields","volume":"43","author":"Cao","year":"2021","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_112","doi-asserted-by":"crossref","first-page":"268","DOI":"10.1109\/PROC.1973.9030","article-title":"The viterbi algorithm","volume":"61","author":"Forney","year":"1973","journal-title":"Proc. 
IEEE"},{"key":"ref_113","first-page":"644","article-title":"RTM3D: Real-Time Monocular 3D Detection from Object Keypoints for Autonomous Driving","volume":"Volume 12348","author":"Li","year":"2020","journal-title":"Proceedings of the European Conference on Computer Vision"},{"key":"ref_114","first-page":"501","article-title":"Beyond Part Models: Person Retrieval with Refined Part Pooling (and A Strong Convolutional Baseline)","volume":"Volume 11208","author":"Sun","year":"2018","journal-title":"Proceedings of the European Conference on Computer Vision"},{"key":"ref_115","doi-asserted-by":"crossref","unstructured":"Zheng, L., Shen, L., Tian, L., Wang, S., Wang, J., and Tian, Q. (2015, January 7\u201313). Scalable Person Re-identification: A Benchmark. Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile.","DOI":"10.1109\/ICCV.2015.133"},{"key":"ref_116","doi-asserted-by":"crossref","first-page":"751","DOI":"10.1016\/S0031-3203(96)00104-5","article-title":"Template matching: Matched spatial filters and beyond","volume":"30","author":"Brunelli","year":"1997","journal-title":"Pattern Recognit."},{"key":"ref_117","doi-asserted-by":"crossref","unstructured":"Wu, Y., Lim, J., and Yang, M.H. (2013, January 23\u201328). Online Object Tracking: A Benchmark. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA.","DOI":"10.1109\/CVPR.2013.312"},{"key":"ref_118","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1137\/0105003","article-title":"Algorithms for the Assignment and Transportation Problems","volume":"5","author":"Munkres","year":"1957","journal-title":"J. Soc. Ind. Appl. Math."},{"key":"ref_119","doi-asserted-by":"crossref","first-page":"185","DOI":"10.1016\/0004-3702(81)90024-2","article-title":"Determining optical flow","volume":"17","author":"Horn","year":"1981","journal-title":"Artif. Intell."},{"key":"ref_120","unstructured":"Hough, P.V. (1962). 
Method and Means for Recognizing Complex Patterns. (3,069,654), U.S. Patent."},{"key":"ref_121","unstructured":"Lucas, B.D., and Kanade, T. (1981, January 24\u201328). An Iterative Image Registration Technique with an Application to Stereo Vision. Proceedings of the 7th International Joint Conference on Artificial Intelligence\u2014Volume 2, San Francisco, CA, USA. IJCAI\u201981."},{"key":"ref_122","first-page":"3","article-title":"Detection and tracking of point","volume":"9","author":"Tomasi","year":"1991","journal-title":"Int. J. Comput. Vis."},{"key":"ref_123","doi-asserted-by":"crossref","unstructured":"Harris, C., and Stephens, M. (1988, January 15\u201317). A combined corner and edge detector. Proceedings of the Alvey Vision Conference, Manchester, UK.","DOI":"10.5244\/C.2.23"},{"key":"ref_124","doi-asserted-by":"crossref","unstructured":"Li, Q., Li, R., Ji, K., and Dai, W. (2015, January 1\u20133). Kalman Filter and Its Application. Proceedings of the 2015 8th International Conference on Intelligent Networks and Intelligent Systems (ICINIS), Tianjin, China.","DOI":"10.1109\/ICINIS.2015.35"},{"key":"ref_125","first-page":"329","article-title":"Scale-Space Filtering","volume":"Volume 2","author":"Witkin","year":"1987","journal-title":"Readings in Computer Vision"},{"key":"ref_126","doi-asserted-by":"crossref","first-page":"170","DOI":"10.1109\/TSMC.1977.4309681","article-title":"Shape Discrimination Using Fourier Descriptors","volume":"7","author":"Persoon","year":"1977","journal-title":"IEEE Trans. Syst. Man Cybern."},{"key":"ref_127","unstructured":"Shi, J. (1994, January 21\u201323). Good features to track. Proceedings of the 1994 IEEE Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA."},{"key":"ref_128","unstructured":"Calonder, M., Lepetit, V., Strecha, C., and Fua, P. (2010). Computer Vision\u2014ECCV 2010, Proceedings of the 11th European Conference on Computer Vision, Heraklion, Crete, Greece, 5\u201311 September 2010, Springer. 
Lecture Notes in Computer Science."},{"key":"ref_129","doi-asserted-by":"crossref","unstructured":"Mozhdehi, R.J., and Medeiros, H. (2017, January 17\u201320). Deep convolutional particle filter for visual tracking. Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China.","DOI":"10.1109\/ICIP.2017.8296963"},{"key":"ref_130","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1023\/B:VISI.0000029664.99615.94","article-title":"Distinctive Image Features from Scale-Invariant Keypoints","volume":"60","author":"Lowe","year":"2004","journal-title":"Int. J. Comput. Vis."},{"key":"ref_131","first-page":"404","article-title":"SURF: Speeded Up Robust Features","volume":"Volume 3951","author":"Bay","year":"2006","journal-title":"Computer Vision\u2013ECCV 2006, Proceedings of the 9th European Conference on Computer Vision, Graz, Austria, 7\u201313 May 2006"},{"key":"ref_132","doi-asserted-by":"crossref","unstructured":"Rublee, E., Rabaud, V., Konolige, K., and Bradski, G. (2011, January 6\u201313). ORB: An efficient alternative to SIFT or SURF. Proceedings of the 2011 International Conference on Computer Vision, Barcelona, Spain.","DOI":"10.1109\/ICCV.2011.6126544"},{"key":"ref_133","doi-asserted-by":"crossref","first-page":"105","DOI":"10.1109\/TPAMI.2008.275","article-title":"Faster and Better: A Machine Learning Approach to Corner Detection","volume":"32","author":"Rosten","year":"2010","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_134","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long Short-Term Memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput."},{"key":"ref_135","doi-asserted-by":"crossref","unstructured":"Nam, H., and Han, B. (2016, January 27\u201330). Learning Multi-domain Convolutional Neural Networks for Visual Tracking. 
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.465"},{"key":"ref_136","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1002\/nav.20053","article-title":"The Hungarian method for the assignment problem","volume":"52","author":"Kuhn","year":"2005","journal-title":"Nav. Res. Logist."},{"key":"ref_137","unstructured":"Koch, G., Zemel, R., and Salakhutdinov, R. (2015, January 6\u201311). Siamese neural networks for one-shot image recognition. Proceedings of the ICML Deep Learning Workshop, Lille, France."},{"key":"ref_138","doi-asserted-by":"crossref","unstructured":"Gerstner, W., and Kistler, W.M. (2002). Spiking Neuron Models: Single Neurons, Populations, Plasticity, Cambridge University Press.","DOI":"10.1017\/CBO9780511815706"},{"key":"ref_139","doi-asserted-by":"crossref","unstructured":"Varga, D., Szir\u00e1nyi, T., Kiss, A., Sp\u00f3r\u00e1s, L., and Havasi, L. (2015, January 7\u201313). A Multi-View Pedestrian Tracking Method in an Uncalibrated Camera Network. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCVW.2015.33"},{"key":"ref_140","unstructured":"Koppanyi, Z., Toth, C., and Soltesz, T. (2017, January 12\u201316). Deriving Pedestrian Positions from Uncalibrated Videos. Proceedings of the ASPRS Imaging & Geospatial Technology Forum (IGTF), Tampa, FL, USA."},{"key":"ref_141","doi-asserted-by":"crossref","first-page":"102","DOI":"10.1186\/s40537-022-00652-w","article-title":"Transfer learning: A friendly introduction","volume":"9","author":"Hosna","year":"2022","journal-title":"J. Big Data"},{"key":"ref_142","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1007\/s11263-015-0816-y","article-title":"ImageNet Large Scale Visual Recognition Challenge","volume":"115","author":"Russakovsky","year":"2015","journal-title":"Int. J. Comput. 
Vis."},{"key":"ref_143","unstructured":"Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., and Zitnick, C.L. (2014). Computer Vision\u2013ECCV 2014, Proceedings of the 13th European Conference, Zurich, Switzerland, 6\u201312 September 2014, Springer. Part V 13."}],"container-title":["Computers"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-431X\/13\/6\/136\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,11,20]],"date-time":"2024-11-20T14:30:13Z","timestamp":1732113013000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-431X\/13\/6\/136"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,5,28]]},"references-count":143,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2024,6]]}},"alternative-id":["computers13060136"],"URL":"https:\/\/doi.org\/10.3390\/computers13060136","relation":{},"ISSN":["2073-431X"],"issn-type":[{"type":"electronic","value":"2073-431X"}],"subject":[],"published":{"date-parts":[[2024,5,28]]}}}