Abstract
The dynamicity of available resources and network conditions, such as channel capacity and traffic characteristics, has posed major challenges to scheduling in wireless networks. Reinforcement learning (RL) enables wireless nodes to observe their respective operating environments, learn, and make optimal or near-optimal scheduling decisions. Learning, the main intrinsic characteristic of RL, enables wireless nodes to adapt over time to most forms of dynamicity in the operating environment. This paper presents an extensive review of the application of traditional and enhanced RL approaches to various types of scheduling schemes in wireless networks, namely packet, sleep-wake, and task schedulers, as well as the advantages and performance enhancements brought about by RL. Additionally, it presents how various challenges associated with scheduling schemes have been approached using RL. Finally, we discuss open issues related to RL-based scheduling schemes in wireless networks in order to explore new research directions in this area. Discussions in this paper are presented in a tutorial manner in order to establish a foundation for further research in this field.
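To make the observe-learn-decide loop described above concrete, the sketch below shows a minimal tabular Q-learning agent choosing which of two priority queues to serve. The environment, state names, reward, and all identifiers are illustrative assumptions for this sketch only, not any specific scheme surveyed in the paper.

```python
import random

# Minimal tabular Q-learning sketch of an RL-based packet scheduler.
# The two actions pick which priority queue to serve in the next slot.
ACTIONS = ["serve_high", "serve_low"]

def choose_action(q_table, state, epsilon=0.1):
    """Epsilon-greedy action selection: explore occasionally, else exploit."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_table.get((state, a), 0.0))

def update(q_table, state, action, reward, next_state,
           alpha=0.5, gamma=0.9):
    """Standard Q-learning update rule."""
    best_next = max(q_table.get((next_state, a), 0.0) for a in ACTIONS)
    old = q_table.get((state, action), 0.0)
    q_table[(state, action)] = old + alpha * (reward + gamma * best_next - old)

def train(episodes=2000, seed=0):
    random.seed(seed)
    q = {}
    for _ in range(episodes):
        # Toy state: whether the high-priority queue is backlogged.
        state = random.choice(["hi_backlog", "empty"])
        action = choose_action(q, state)
        # Toy reward: serving the backlogged high-priority queue pays off.
        reward = 1.0 if (state == "hi_backlog" and action == "serve_high") else 0.0
        update(q, state, action, reward, "empty")
    return q

q = train()
```

After training, the learned Q-values should rank serving the high-priority queue above serving the low-priority one whenever the former is backlogged, illustrating how a node adapts its scheduling policy purely from observed rewards.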
Author information
Kok-Lim Alvin Yau has a BE from Universiti Teknologi Petronas, Malaysia, an MSc from the National University of Singapore, and a PhD from Victoria University, New Zealand. His research interests are wireless networks and applied artificial intelligence.
Kae Hsiang Kwong has a BE from Jinan University, China and a PhD from University of Strathclyde, UK. His research interests include network infrastructure design, monitoring, and performance optimization.
Chong Shen has a BE from Wuhan University, China, an MPhil from the University of Strathclyde, UK, and a PhD from Cork Institute of Technology, Ireland. His research interests cover layer-2 and layer-3 algorithms and protocols.
Cite this article
Yau, KL.A., Kwong, K.H. & Shen, C. Reinforcement learning models for scheduling in wireless networks. Front. Comput. Sci. 7, 754–766 (2013). https://doi.org/10.1007/s11704-013-2291-3