{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,9,20]],"date-time":"2024-09-20T17:00:28Z","timestamp":1726851628732},"reference-count":33,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2023,4,27]],"date-time":"2023-04-27T00:00:00Z","timestamp":1682553600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"Several approaches have applied Deep Reinforcement Learning (DRL) to Unmanned Aerial Vehicles (UAVs) to do autonomous object tracking. These methods, however, are resource intensive and require prior knowledge of the environment, making them difficult to use in real-world applications. In this paper, we propose a Lightweight Deep Vision Reinforcement Learning (LDVRL) framework for dynamic object tracking that uses the camera as the only input source. Our framework employs several techniques such as stacks of frames, segmentation maps from the simulation, and depth images to reduce the overall computational cost. We conducted the experiment with a non-sparse Deep Q-Network (DQN) (value-based) and a Deep Deterministic Policy Gradient (DDPG) (actor-critic) to test the adaptability of our framework with different methods and identify which DRL method is the most suitable for this task. In the end, a DQN is chosen for several reasons. Firstly, a DQN has fewer networks than a DDPG, hence reducing the computational resources on physical UAVs. Secondly, it is surprising that although a DQN is smaller in model size than a DDPG, it still performs better in this specific task. Finally, a DQN is very practical for this task due to the ability to operate in continuous state space. Using a high-fidelity simulation environment, our proposed approach is verified to be effective.<\/jats:p>","DOI":"10.3390\/a16050227","type":"journal-article","created":{"date-parts":[[2023,4,28]],"date-time":"2023-04-28T05:33:40Z","timestamp":1682660020000},"page":"227","source":"Crossref","is-referenced-by-count":5,"title":["UAV Dynamic Object Tracking with Lightweight Deep Vision Reinforcement Learning"],"prefix":"10.3390","volume":"16","author":[{"ORCID":"http:\/\/orcid.org\/0009-0008-7092-3106","authenticated-orcid":false,"given":"Hy","family":"Nguyen","sequence":"first","affiliation":[{"name":"Applied Artificial Intelligence Institute (A<\/i>2<\/sup>I<\/i>2<\/sup>), Deakin University, Geelong, VIC 3216, Australia"}]},{"ORCID":"http:\/\/orcid.org\/0000-0002-7848-9008","authenticated-orcid":false,"given":"Srikanth","family":"Thudumu","sequence":"additional","affiliation":[{"name":"Applied Artificial Intelligence Institute (A<\/i>2<\/sup>I<\/i>2<\/sup>), Deakin University, Geelong, VIC 3216, Australia"}]},{"ORCID":"http:\/\/orcid.org\/0000-0003-1415-5786","authenticated-orcid":false,"given":"Hung","family":"Du","sequence":"additional","affiliation":[{"name":"Applied Artificial Intelligence Institute (A<\/i>2<\/sup>I<\/i>2<\/sup>), Deakin University, Geelong, VIC 3216, Australia"}]},{"ORCID":"http:\/\/orcid.org\/0000-0003-4447-5166","authenticated-orcid":false,"given":"Kon","family":"Mouzakis","sequence":"additional","affiliation":[{"name":"Applied Artificial Intelligence Institute (A<\/i>2<\/sup>I<\/i>2<\/sup>), Deakin University, Geelong, VIC 3216, Australia"}]},{"ORCID":"http:\/\/orcid.org\/0000-0003-4805-1467","authenticated-orcid":false,"given":"Rajesh","family":"Vasa","sequence":"additional","affiliation":[{"name":"Applied Artificial Intelligence Institute (A<\/i>2<\/sup>I<\/i>2<\/sup>), Deakin University, Geelong, VIC 3216, Australia"}]}],"member":"1968","published-online":{"date-parts":[[2023,4,27]]},"reference":[{"key":"ref_1","first-page":"5578490","article-title":"A dual-mode medium access control mechanism for UAV-enabled intelligent transportation system","volume":"2021","author":"Khan","year":"2021","journal-title":"Mob. Inf. Syst."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"719","DOI":"10.1007\/s10846-019-01045-7","article-title":"Zoning a service area of unmanned aerial vehicles for package delivery services","volume":"97","author":"Sung","year":"2020","journal-title":"J. Intell. Robot. Syst."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"580","DOI":"10.1109\/LWC.2018.2880467","article-title":"Resource allocation in UAV-assisted M2M communications for disaster rescue","volume":"8","author":"Liu","year":"2018","journal-title":"IEEE Wirel. Commun. Lett."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Wang, Y., Su, Z., Xu, Q., Li, R., and Luan, T.H. (2021, January 10\u201313). Lifesaving with RescueChain: Energy-efficient and partition-tolerant blockchain based secure information sharing for UAV-aided disaster rescue. Proceedings of the IEEE Conference on Computer Communications (IEEE INFOCOM 2021), Vancouver, BC, Canada.","DOI":"10.1109\/INFOCOM42981.2021.9488719"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"16","DOI":"10.1007\/s10846-021-01462-7","article-title":"Maturity levels of public safety applications using unmanned aerial systems: A review","volume":"103","author":"Stampa","year":"2021","journal-title":"J. Intell. Robot. Syst."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Nikolic, J., Burri, M., Rehder, J., Leutenegger, S., Huerzeler, C., and Siegwart, R. (2013, January 2\u20139). A UAV system for inspection of industrial facilities. Proceedings of the 2013 IEEE Aerospace Conference, Big Sky, MT, USA.","DOI":"10.1109\/AERO.2013.6496959"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Lebedev, I., Ianin, A., Usina, E., and Shulyak, V. (2021, January 15\u201318). Construction of land base station for UAV maintenance automation. Proceedings of the 15th International Conference on Electromechanics and Robotics \u201cZavalishin\u2019s Readings\u201d, Ufa, Russia.","DOI":"10.1007\/978-981-15-5580-0_41"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Muhammad, A., Shahpurwala, A., Mukhopadhyay, S., and El-Hag, A.H. (2019, January 20\u201322). Autonomous drone-based powerline insulator inspection via deep learning. Proceedings of the Iberian Robotics Conference, Porto, Portugal.","DOI":"10.1007\/978-3-030-35990-4_5"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Wang, C.Y., Bochkovskiy, A., and Liao, H.Y.M. (2021, January 20\u201325). Scaled-yolov4: Scaling cross stage partial network. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01283"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Trujillo, J.C., Munguia, R., Urzua, S., and Grau, A. (2020). Cooperative Visual-SLAM System for UAV-Based Target Tracking in GPS-Denied Environments: A Target-Centric Approach. Electronics, 9.","DOI":"10.3390\/electronics9050813"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"279","DOI":"10.1007\/BF00992698","article-title":"Q-learning","volume":"8","author":"Watkins","year":"1992","journal-title":"Mach. Learn."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"529","DOI":"10.1038\/nature14236","article-title":"Human-level control through deep reinforcement learning","volume":"518","author":"Mnih","year":"2015","journal-title":"Nature"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Shin, S.Y., Kang, Y.W., and Kim, Y.G. (2019, January 23\u201326). Automatic drone navigation in realistic 3d landscapes using deep reinforcement learning. Proceedings of the 2019 6th International Conference on Control, Decision and Information Technologies (CoDIT), Paris, France.","DOI":"10.1109\/CoDIT.2019.8820322"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Bhagat, S., and Sujit, P. (2020, January 9\u201312). UAV target tracking in urban environments using deep reinforcement learning. Proceedings of the 2020 International Conference on Unmanned Aircraft Systems (ICUAS), Athens, Greece.","DOI":"10.1109\/ICUAS48674.2020.9213856"},{"key":"ref_15","unstructured":"Sutton, R.S., McAllester, D., Singh, S., and Mansour, Y. (December, January 29). Policy gradient methods for reinforcement learning with function approximation. Proceedings of the Advances in Neural Information Processing Systems, Denver, CO, USA."},{"key":"ref_16","unstructured":"Konda, V., and Tsitsiklis, J. (December, January 29). Actor-critic algorithms. Proceedings of the Advances in Neural Information Processing Systems, Denver, CO, USA."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Zhaowei, M., Yifeng, N., and Lincheng, S. (2016, January 12\u201315). Vision-based behavior for UAV reactive avoidance by using a reinforcement learning method. Proceedings of the 2016 12th World Congress on Intelligent Control and Automation (WCICA), Guilin, China.","DOI":"10.1109\/WCICA.2016.7578765"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Zhu, Y., Mottaghi, R., Kolve, E., Lim, J.J., Gupta, A., Fei-Fei, L., and Farhadi, A. (June, January 29). Target-driven visual navigation in indoor scenes using deep reinforcement learning. Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore.","DOI":"10.1109\/ICRA.2017.7989381"},{"key":"ref_19","unstructured":"Lillicrap, T.P., Hunt, J.J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., and Wierstra, D. (2015). Continuous control with deep reinforcement learning. arXiv."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Al-Qubaydhi, N., Alenezi, A., Alanazi, T., Senyor, A., Alanezi, N., Alotaibi, B., Alotaibi, M., Razaque, A., Abdelhamid, A.A., and Alotaibi, A. (2022). Detection of Unauthorized Unmanned Aerial Vehicles Using YOLOv5 and Transfer Learning. Electronics, 11.","DOI":"10.20944\/preprints202202.0185.v1"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Liu, B., and Luo, H. (2022). An Improved Yolov5 for Multi-Rotor UAV Detection. Electronics, 11.","DOI":"10.3390\/electronics11152330"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"107261","DOI":"10.1016\/j.compeleceng.2021.107261","article-title":"YOLOv4_Drone: UAV image target detection based on an improved YOLOv4 algorithm","volume":"93","author":"Tan","year":"2021","journal-title":"Comput. Electr. Eng."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Wu, W., Liu, H., Li, L., Long, Y., Wang, X., Wang, Z., Li, J., and Chang, Y. (2021). Application of local fully Convolutional Neural Network combined with YOLO v5 algorithm in small target detection of remote sensing image. PLoS ONE, 16.","DOI":"10.1371\/journal.pone.0259283"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Shah, S., Dey, D., Lovett, C., and Kapoor, A. (2017, January 12\u201315). Airsim: High-fidelity visual and physical simulation for autonomous vehicles. Proceedings of the Field and Service Robotics, Zurich, Switzerland.","DOI":"10.1007\/978-3-319-67361-5_40"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long short-term memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput."},{"key":"ref_26","unstructured":"Singhal, G., Bansod, B., and Mathew, L. (2022, September 10). Unmanned Aerial Vehicle Classification, Applications and Challenges: A Review. Available online: https:\/\/www.preprints.org\/manuscript\/201811.0601\/v1."},{"key":"ref_27","first-page":"679","article-title":"A Markovian decision process","volume":"6","author":"Bellman","year":"1957","journal-title":"J. Math. Mech."},{"key":"ref_28","unstructured":"Jaakkola, T., Singh, S., and Jordan, M. (December, January 28). Reinforcement learning algorithm for partially observable Markov decision problems. Proceedings of the Advances in Neural Information Processing Systems, Denver, CO, USA."},{"key":"ref_29","unstructured":"Lin, L. (1992). Reinforcement Learning for Robots Using Neural Networks, Carnegie Mellon University."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"2278","DOI":"10.1109\/5.726791","article-title":"Gradient-based learning applied to document recognition","volume":"86","author":"LeCun","year":"1998","journal-title":"Proc. IEEE"},{"key":"ref_31","unstructured":"Sutton, R.S., and Barto, A.G. (2018). Reinforcement Learning: An Introduction, MIT Press."},{"key":"ref_32","unstructured":"Fujimoto, S., Hoof, H., and Meger, D. (2018, January 10\u201315). Addressing function approximation error in actor-critic methods. Proceedings of the International Conference on Machine Learning, Stockholm, Sweden."},{"key":"ref_33","unstructured":"Haarnoja, T., Zhou, A., Abbeel, P., and Levine, S. (2018, January 10\u201315). Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. Proceedings of the International Conference on Machine Learning, Stockholm, Sweden."}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/16\/5\/227\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,4,29]],"date-time":"2023-04-29T04:31:13Z","timestamp":1682742673000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/16\/5\/227"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,4,27]]},"references-count":33,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2023,5]]}},"alternative-id":["a16050227"],"URL":"https:\/\/doi.org\/10.3390\/a16050227","relation":{},"ISSN":["1999-4893"],"issn-type":[{"value":"1999-4893","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,4,27]]}}}