Abstract
Deep learning and reinforcement learning are two of the most active branches of machine learning, and both play a vital role in artificial intelligence. Deep reinforcement learning combines the two and yields a learning method that more closely resembles how humans learn. Q-learning, one of the most basic reinforcement learning algorithms, is an off-policy learning algorithm that uses a reasonable behavior policy to generate actions. The optimal Q-function is obtained from the rewards and next states produced when those actions interact with the environment. Building on Q-learning and convolutional neural networks, this paper develops deep Q-learning with experience replay. A discount factor is included in the value function to ensure that it converges, and the temporal difference method is used to train the Q-function or value function. Finally, a detailed procedure for implementing deep reinforcement learning is proposed.
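The ingredients named in the abstract (a Q-function, a discount factor, a temporal difference update, and experience replay) can be illustrated with a minimal sketch. It is not the paper's implementation: a lookup table stands in for the convolutional network, and the chain environment, the function names (td_update, step, choose_action, train), and all hyperparameter values are assumptions chosen only for illustration.

```python
import random
from collections import deque


def td_update(q, s, a, r, s_next, done, alpha=0.1, gamma=0.9):
    """One Q-learning temporal-difference step on a tabular Q-function.

    gamma is the discount factor; alpha is the learning rate (both assumed values).
    """
    # Bootstrapped target: reward plus discounted best value of the next state.
    target = r if done else r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def step(s, a, n_states):
    """Hypothetical chain environment: action 1 moves right, action 0 moves left.

    Reaching the right-most state ends the episode with reward 1.
    """
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    done = s_next == n_states - 1
    return s_next, (1.0 if done else 0.0), done


def choose_action(q, s, epsilon, rng):
    """Epsilon-greedy behavior policy with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[s]))
    best = max(q[s])
    return rng.choice([a for a, v in enumerate(q[s]) if v == best])


def train(n_states=5, episodes=200, epsilon=0.2, batch_size=8, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    replay = deque(maxlen=500)  # experience replay memory
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            a = choose_action(q, s, epsilon, rng)
            s_next, r, done = step(s, a, n_states)
            replay.append((s, a, r, s_next, done))
            # Learn from a random minibatch of stored transitions,
            # not only from the most recent one.
            batch = rng.sample(list(replay), min(batch_size, len(replay)))
            for transition in batch:
                td_update(q, *transition)
            s = s_next
    return q


if __name__ == "__main__":
    for state, values in enumerate(train()):
        print(state, [round(v, 3) for v in values])
```

In a deep Q-learning setting the table q would be replaced by a neural network and td_update by a gradient step on the squared temporal difference error; the replay memory and the discounted target keep the same form.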
Acknowledgments
This work was supported by the National Natural Science Foundation of China under Grant 61673117.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Tan, F., Yan, P., Guan, X. (2017). Deep Reinforcement Learning: From Q-Learning to Deep Q-Learning. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, ES. (eds) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science, vol 10637. Springer, Cham. https://doi.org/10.1007/978-3-319-70093-9_50
DOI: https://doi.org/10.1007/978-3-319-70093-9_50
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70092-2
Online ISBN: 978-3-319-70093-9
eBook Packages: Computer Science, Computer Science (R0)