Abstract
In this paper, we study constrained continuous-time Markov decision processes with a denumerable state space and unbounded reward/cost and transition rates. The criterion to be maximized is the expected average reward, and a constraint is imposed on an expected average cost. We give suitable conditions that ensure the existence of a constrained-optimal policy. Moreover, we show that the constrained-optimal policy randomizes between two stationary policies differing in at most one state. Finally, we use a controlled queueing system to illustrate our conditions.
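For orientation, the constrained problem described in the abstract can be written schematically as below. The notation is an assumption chosen for illustration and is not taken from the paper: $r$ and $c$ stand for the reward and cost rates, $x(t)$ and $a(t)$ for the state and action processes, $\Pi$ for the class of admissible policies, $i$ for the initial state, and $\beta$ for the constraint level. The direction of the inequality and the use of $\liminf$ and $\limsup$ are likewise assumed conventions.

```latex
% Schematic statement only; all symbols are illustrative assumptions.
\begin{align*}
V_r(i,\pi) &= \liminf_{T\to\infty} \frac{1}{T}\,
  E_i^{\pi}\!\left[\int_0^T r\bigl(x(t),a(t)\bigr)\,dt\right]
  && \text{(expected average reward)}\\
V_c(i,\pi) &= \limsup_{T\to\infty} \frac{1}{T}\,
  E_i^{\pi}\!\left[\int_0^T c\bigl(x(t),a(t)\bigr)\,dt\right]
  && \text{(expected average cost)}\\
&\text{maximize } V_r(i,\pi) \ \text{over } \pi\in\Pi
  \quad\text{subject to}\quad V_c(i,\pi)\le\beta .
\end{align*}
```

Under one possible reading of the structural result, if $f_1$ and $f_2$ are stationary policies that agree everywhere except possibly at a single state $i^*$, the constrained-optimal policy follows their common action at every other state and, at $i^*$, chooses $f_1(i^*)$ with some probability $p\in[0,1]$ and $f_2(i^*)$ with probability $1-p$.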
Additional information
Supported by NSFC, NCET and RFDP.
About this article
Cite this article
Zhang, L., Guo, X. Constrained continuous-time Markov decision processes with average criteria. Math Meth Oper Res 67, 323–340 (2008). https://doi.org/10.1007/s00186-007-0154-0
DOI: https://doi.org/10.1007/s00186-007-0154-0
Keywords
- Continuous-time Markov decision process
- Unbounded reward/cost and transition rates
- Average criteria
- Constrained-optimal policy