{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,9,2]],"date-time":"2024-09-02T10:34:00Z","timestamp":1725273240763},"reference-count":36,"publisher":"MDPI AG","issue":"9","license":[{"start":{"date-parts":[[2024,4,26]],"date-time":"2024-04-26T00:00:00Z","timestamp":1714089600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Science and Technology Foundation of Guangdong Province","award":["2021A0101180005"]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"Delay-sensitive task offloading in a device-to-device assisted mobile edge computing (D2D-MEC) system with energy harvesting devices is a critical challenge due to the dynamic load level at edge nodes and the variability in harvested energy. In this paper, we propose a joint dynamic task offloading and CPU frequency control scheme for delay-sensitive tasks in a D2D-MEC system, taking into account the intricacies of multi-slot tasks, characterized by diverse processing speeds and data transmission rates. Our methodology involves meticulous modeling of task arrival and service processes using queuing systems, coupled with the strategic utilization of D2D communication to alleviate edge server load and prevent network congestion effectively. Central to our solution is the formulation of average task delay optimization as a challenging nonlinear integer programming problem, requiring intelligent decision making regarding task offloading for each generated task at active mobile devices and CPU frequency adjustments at discrete time slots. To navigate the intricate landscape of the extensive discrete action space, we design an efficient multi-agent DRL learning algorithm named MAOC, which is based on MAPPO, to minimize the average task delay by dynamically determining task-offloading decisions and CPU frequencies. 
MAOC operates within a centralized training with decentralized execution (CTDE) framework, empowering individual mobile devices to make decisions autonomously based on their unique system states. Experimental results demonstrate its swift convergence and operational efficiency, and it outperforms other baseline algorithms.","DOI":"10.3390\/s24092779","type":"journal-article","created":{"date-parts":[[2024,4,26]],"date-time":"2024-04-26T14:56:32Z","timestamp":1714143392000},"page":"2779","source":"Crossref","is-referenced-by-count":3,"title":["A Multi-Agent RL Algorithm for Dynamic Task Offloading in D2D-MEC Network with Energy Harvesting"],"prefix":"10.3390","volume":"24","author":[{"given":"Xin","family":"Mi","sequence":"first","affiliation":[{"name":"School of Computer, Zhongshan Institute, University of Electronic Science and Technology of China, Zhongshan 528400, China"}]},{"ORCID":"http:\/\/orcid.org\/0000-0003-4437-4277","authenticated-orcid":false,"given":"Huaiwen","family":"He","sequence":"additional","affiliation":[{"name":"School of Computer, Zhongshan Institute, University of Electronic Science and Technology of China, Zhongshan 528400, China"}]},{"given":"Hong","family":"Shen","sequence":"additional","affiliation":[{"name":"Engineering and Technology, Central Queensland University, Brisbane 4000, Australia"}]}],"member":"1968","published-online":{"date-parts":[[2024,4,26]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"108900","DOI":"10.1016\/j.comnet.2022.108900","article-title":"Joint computing, communication and cost-aware task offloading in d2d-enabled het-mec","volume":"209","author":"Abbas","year":"2022","journal-title":"Comput. Netw."},{"key":"ref_2","first-page":"6599","article-title":"Multi-objective parallel task offloading and content caching in d2d-aided mec networks","volume":"22","author":"Xiao","year":"2022","journal-title":"IEEE Trans. Mob. 
Comput."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"7916","DOI":"10.1109\/TVT.2020.2993849","article-title":"Deep reinforcement learning-based adaptive computation offloading for mec in heterogeneous vehicular networks","volume":"69","author":"Ke","year":"2020","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1109\/MCOM.2019.1800608","article-title":"Intelligent offloading in multi-access edge computing: A state-of-the-art review and framework","volume":"57","author":"Cao","year":"2019","journal-title":"IEEE Commun. Mag."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Mi, X., and He, H. (2023, January 19). Multi-agent deep reinforcement learning for d2d-assisted mec system with energy harvesting. Proceedings of the 2023 25th International Conference on Advanced Communication Technology (ICACT), Pyeongchang, South Korea.","DOI":"10.23919\/ICACT56868.2023.10079275"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"103657","DOI":"10.1016\/j.jnca.2023.103657","article-title":"A survey on essential challenges in relay-aided d2d communication for next-generation cellular networks","volume":"216","author":"Salim","year":"2023","journal-title":"J. Netw. Comput. Appl."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"491","DOI":"10.1109\/TNET.2023.3288558","article-title":"Mean field graph based d2d collaboration and offloading pricing in mobile edge computing","volume":"32","author":"Wang","year":"2023","journal-title":"IEEE\/ACM Trans. Netw."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"2117","DOI":"10.1109\/TII.2022.3206787","article-title":"Lyapunov-guided delay-aware energy efficient offloading in iiot-mec systems","volume":"19","author":"Wu","year":"2022","journal-title":"IEEE Trans. Ind. Inform."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Wang, H., Lin, Z., and Lv, T. (2021, January 29). 
Energy and delay minimization of partial computing offloading for d2d-assisted mec systems. Proceedings of the 2021 IEEE Wireless Communications and Networking Conference (WCNC), Nanjing, China.","DOI":"10.1109\/WCNC49053.2021.9417536"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"17508","DOI":"10.1109\/JIOT.2021.3081694","article-title":"A drl agent for jointly optimizing computation offloading and resource allocation in mec","volume":"8","author":"Chen","year":"2021","journal-title":"IEEE Internet Things J."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1487","DOI":"10.1109\/TCC.2022.3140231","article-title":"Deep reinforcement learning-based joint optimization of delay and privacy in multiple-user mec systems","volume":"11","author":"Zhao","year":"2022","journal-title":"IEEE Trans. Cloud Comput."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3544836","article-title":"Scheduling iot applications in edge and fog computing environments: A taxonomy and future directions","volume":"55","author":"Goudarzi","year":"2022","journal-title":"ACM Comput. Surv."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1985","DOI":"10.1109\/TMC.2020.3036871","article-title":"Deep reinforcement learning for task offloading in mobile edge computing systems","volume":"21","author":"Tang","year":"2020","journal-title":"IEEE Trans. Mob. Comput."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"8050","DOI":"10.1109\/TVT.2019.2924015","article-title":"Online deep reinforcement learning for computation offloading in blockchain-empowered mobile edge computing","volume":"68","author":"Qiu","year":"2019","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Huang, H., Ye, Q., and Du, H. (2020, January 7). Reinforcement learning based offloading for realtime applications in mobile edge computing. 
Proceedings of the ICC 2020-2020 IEEE International Conference on Communications (ICC), Dublin, Ireland.","DOI":"10.1109\/ICC40277.2020.9148748"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Li, J., Gao, H., Lv, T., and Lu, Y. (2018, January 15). Deep reinforcement learning based computation offloading and resource allocation for mec. Proceedings of the 2018 IEEE Wireless Communications and Networking Conference (WCNC), Barcelona, Spain.","DOI":"10.1109\/WCNC.2018.8377343"},{"key":"ref_17","unstructured":"Sunehag, P., Lever, G., Gruslys, A., Czarnecki, W.M., Zambaldi, V., Jaderberg, M., Lanctot, M., Sonnerat, N., Leibo, J.Z., and Tuyls, K. (2017). Value-decomposition networks for cooperative multi-agent learning. arXiv."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"108678","DOI":"10.1016\/j.comnet.2021.108678","article-title":"Energy harvesting computation offloading game towards minimizing delay for mobile edge computing","volume":"204","author":"Guo","year":"2022","journal-title":"Comput. Netw."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"4553","DOI":"10.1109\/TITS.2022.3178896","article-title":"Multi-irs and multi-uav-assisted mec system for 5g\/6g networks: Efficient joint trajectory optimization and passive beamforming framework","volume":"24","author":"Asim","year":"2022","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"4110","DOI":"10.1109\/JSYST.2019.2921115","article-title":"Task execution cost minimization-based joint computation offloading and resource allocation for cellular d2d mec systems","volume":"13","author":"Chai","year":"2019","journal-title":"IEEE Syst. 
J."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"160","DOI":"10.1016\/j.neucom.2019.11.081","article-title":"Joint offloading and scheduling decisions for dag applications in mobile edge computing","volume":"424","author":"Liang","year":"2021","journal-title":"Neurocomputing"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"6743","DOI":"10.1109\/TWC.2021.3076201","article-title":"Online distributed offloading and computing resource management with energy harvesting for heterogeneous mec-enabled iot","volume":"20","author":"Xia","year":"2021","journal-title":"IEEE Trans. Wirel. Commun."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"4005","DOI":"10.1109\/JIOT.2018.2876279","article-title":"Optimized computation offloading performance in virtual edge computing systems via deep reinforcement learning","volume":"6","author":"Chen","year":"2018","journal-title":"IEEE Internet Things J."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"10","DOI":"10.1016\/j.dcan.2018.10.003","article-title":"Deep reinforcement learning-based joint task offloading and bandwidth allocation for multi-user mobile edge computing","volume":"5","author":"Huang","year":"2019","journal-title":"Digit. Commun. Networks"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"2581","DOI":"10.1109\/TMC.2019.2928811","article-title":"Deep reinforcement learning for online computation offloading in wireless powered mobile-edge computing networks","volume":"19","author":"Huang","year":"2019","journal-title":"IEEE Trans. Mob. Comput."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Hao, H., Xu, C., Zhang, W., Yang, S., and Muntean, G.-M. (IEEE Trans. Mob. Comput., 2024). Joint task offloading, resource allocation, and trajectory design for multi-uav cooperative edge computing with task priority, IEEE Trans. Mob. 
Comput., in press.","DOI":"10.1109\/TMC.2024.3350078"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"480","DOI":"10.1109\/TII.2022.3158974","article-title":"Task co-offloading for d2d-assisted mobile edge computing in industrial internet of things","volume":"19","author":"Dai","year":"2022","journal-title":"IEEE Trans. Ind. Inform."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"8005","DOI":"10.1109\/JIOT.2020.3041673","article-title":"Resource management for computation offloading in d2d-aided wireless powered mobile-edge computing networks","volume":"8","author":"Sun","year":"2020","journal-title":"IEEE Internet Things J."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"12206","DOI":"10.1109\/TVT.2022.3192345","article-title":"Latency minimization for mmwave d2d mobile edge computing systems: Joint task allocation and hybrid beamforming design","volume":"71","author":"Liu","year":"2022","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"2023","DOI":"10.1007\/s11276-021-02554-w","article-title":"Joint computation offloading and task caching for multi-user and multi-task mec systems: Reinforcement learning-based algorithms","volume":"27","author":"Elgendy","year":"2021","journal-title":"Wirel. Netw."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"9303","DOI":"10.1109\/JIOT.2020.3000527","article-title":"Dynamic computation offloading with energy harvesting devices: A hybrid-decision-based deep reinforcement learning approach","volume":"7","author":"Zhang","year":"2020","journal-title":"IEEE Internet Things J."},{"key":"ref_32","unstructured":"Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O. (2017). Proximal policy optimization algorithms. arXiv."},{"key":"ref_33","unstructured":"Haarnoja, T., Zhou, A., Abbeel, P., and Levine, S. (2018, January 3). Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. 
Proceedings of the International Conference on Machine Learning, Stockholm, Sweden."},{"key":"ref_34","unstructured":"Kakade, S., and Langford, J. (2002, January 8). Approximately optimal approximate reinforcement learning. Proceedings of the 19th International Conference on Machine Learning, Sydney, Australia."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Li, G., Chen, M., Wei, X., Qi, T., and Zhuang, W. (2020, January 15). Computation offloading with reinforcement learning in d2d-mec network. Proceedings of the 2020 International Wireless Communications and Mobile Computing (IWCMC), Limassol, Cyprus.","DOI":"10.1109\/IWCMC48107.2020.9148285"},{"key":"ref_36","unstructured":"Yu, C., Velu, A., Vinitsky, E., Wang, Y., Bayen, A., and Wu, Y. (2021). The surprising effectiveness of ppo in cooperative, multi-agent games. arXiv."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/9\/2779\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,4,30]],"date-time":"2024-04-30T08:14:23Z","timestamp":1714464863000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/9\/2779"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,4,26]]},"references-count":36,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2024,5]]}},"alternative-id":["s24092779"],"URL":"https:\/\/doi.org\/10.3390\/s24092779","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,4,26]]}}}