Abstract
This paper deals with the finite approximation of the first passage models for discrete-time Markov decision processes with varying discount factors. For a given control model \(\mathcal {M}\) with denumerable states and compact Borel action sets, and possibly unbounded reward functions, under reasonable conditions we prove that there exists a sequence of control models \(\mathcal {M}_{n}\) such that the first passage optimal rewards and policies of \(\mathcal {M}_{n}\) converge to those of \(\mathcal {M}\), respectively. Based on the convergence theorems, we propose a finite-state and finite-action truncation method for the given control model \(\mathcal {M}\), and show that the first passage optimal reward and policies of \(\mathcal {M}\) can be approximated by those of the solvable truncated finite control models. Finally, we give the corresponding value and policy iteration algorithms to solve the finite approximation models.
Similar content being viewed by others
References
Altman E (1994) Denumerable constrainedMarkov decision processes and finite approximations.MathMeth Operat Res 19:169–191
Alvarez-Mena J, Hernández-Lerma O (2002) Convergence of the optimal values of constrained Markov control processes. Math Meth Oper Res 55:461–484
Bäuerle N, Rieder U (2011) Markov decision processes with applications in finance. Springer, Heidelberg
Derman C (1970) Finite State Markovian Decision Processes, vol. 67 of Mathematics in Science and Engineering. Academic Press, New York
González-Hernández J, López-Martínez RR, Minjárez-Sosa JA(2009) Approximation,estimation, and control of stochastic systems under a randomized discounted cost criterion. Kybernetika 45:737–754
Guo XP, Hernández-del-Valle A, Hernández-Lerma O (2012) First passage problems for nonstationary discrete-time stochastic control systems. Eur J Control 18:528–538
Guo XP, Zhang WZ (2014) Convergence of controlled models and finite-state approximation for discounted continuous-timeMarkov decision processes with constraints. Eur J Oper Res 238:486–496
Guo XY, Song XP, Zhang Y (2014) First passage criteria for continuous-time Markov decision processes with varying discount factors and history-dependent policies. IEEE Trans Automat Control 59:163–174
Hernández-Lerma O, Lasserre JB (1996) Discrete-Time Markov Control Processes. Springer-Verlag, New York
Hernández-Lerma O, Lasserre JB (1999) Further topics on discrete-timeMarkov control processes. Springer-Verlag, New York
Hinderer K, Waldmann K-H (2005) Algorithms for countable state Markov decision models with an absorbing set. SIAM J Control Optim 43:2109–2131
Huang YH, Guo XP (2011) First passage models for denumerable semi-Markov decision processes with nonnegative discounted costs. Acta Math Appl Sinica 27:177–190
Huang YH, Wei QD, Guo XP (2013) Constrained Markov decision processes with first passage criteria. Ann Oper Res 206:197–C219
Kitaev MY, Rykov VV (1995) Controlled Queueing Systems. CRC Press
Liu JY, Huang SM (2001) Markov decision processes with distribution function criterion of first-passage time. Appl Math Optim 43:187–201
Liu JY, Liu K (1992) Markov decision programming-the first passage model with denumerable state space. Systems Sci Math Sci 5:340–351
Meyn SP, Tweedie RL (2009) Markov Chains and Stochastic Stability. Cambridge University Press, New York
Prieto-Rumeau T, Hernández-Lerma O (2012) Discounted continuous-time controlled Markov chains: convergence of control models. J Appl Prob 49:1072–1090
Puterman ML (1994) Markov Decision Processes. Wiley, New York
Schäl M (2005) Control of ruin probabilities by discrete-time investments. Math Methods Oper Res 62:141–158
Schmidli H (2008) Stochastic Contol in Insurance. Springer-Verlag, London
Wei QD, Guo XP (2011) Markov decision processes with state-dependent discount factors and unbounded rewards/costs. Oper Res Lett 39:369–374
Wu X, Guo XP (2015) First passage optimality and variance minimization of Markov decision processes with varying discount factors. J Appl Prob 52, no.2 (to be publishes)
Yu SX, Lin YL, Yan PF (1998) Optimization models for the first arrival target distribution function in discrete time. J Math Anal Appl 225:193–223
Acknowledgments
Research of the first coauthor was partially supported by National Natural Science Foundation of China (NSFC no. 41271076). Research of the second coauthor was partially supported by NSFC (no. 61004037), Research Fund for the Doctoral Program of Higher Education of China, the Fundamental Research Funds for the Central Universities, and by Guangdong Province Key Laboratory of Computational Science.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Wu, X., Zhang, J. Finite approximation of the first passage models for discrete-time Markov decision processes with varying discount factors. Discrete Event Dyn Syst 26, 669–683 (2016). https://doi.org/10.1007/s10626-014-0209-3
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10626-014-0209-3