Constrained continuous-time Markov decision processes with average criteria

  • Original Article
Mathematical Methods of Operations Research

Abstract

In this paper, we study constrained continuous-time Markov decision processes with a denumerable state space and unbounded reward/cost and transition rates. The criterion to be maximized is the expected average reward, and a constraint is imposed on an expected average cost. We give suitable conditions that ensure the existence of a constrained-optimal policy. Moreover, we show that the constrained-optimal policy randomizes between two stationary policies differing in at most one state. Finally, we use a controlled queueing system to illustrate our conditions.
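The problem described in the abstract can be written down roughly as follows. This is only a sketch in assumed notation: the symbols x(t), r, c, β, f_1, f_2, i^* and p do not appear in the abstract, and the liminf/limsup conventions are one common choice for average criteria, not necessarily the one used in the paper.

```latex
% Assumed notation (not given in the abstract): x(t) is the state process,
% r and c are the reward and cost rates, \beta is the constraint level,
% and r(x(t), \pi_t) is shorthand for the reward rate averaged over the
% action distribution chosen by \pi at time t.
\[
V_r(i,\pi) = \liminf_{T\to\infty} \frac{1}{T}\,
  E_i^{\pi}\!\left[\int_0^T r\bigl(x(t),\pi_t\bigr)\,dt\right],
\qquad
V_c(i,\pi) = \limsup_{T\to\infty} \frac{1}{T}\,
  E_i^{\pi}\!\left[\int_0^T c\bigl(x(t),\pi_t\bigr)\,dt\right].
\]
% Constrained problem: maximize the average reward while keeping
% the average cost at or below the level \beta.
\[
\text{maximize } V_r(i,\pi) \quad \text{subject to } V_c(i,\pi) \le \beta .
\]
% A policy that randomizes between two deterministic stationary policies
% f_1 and f_2 which agree everywhere except possibly at one state i^*.
\[
\pi^{p}(\cdot \mid i) = p\,\delta_{f_1(i)}(\cdot) + (1-p)\,\delta_{f_2(i)}(\cdot),
\quad p \in [0,1], \qquad f_1(i) = f_2(i) \ \text{for all } i \ne i^{*}.
\]
```

Under this reading, the mixed policy is deterministic in every state other than i^*, and the weight p would be tuned so that the cost constraint is met.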

Author information

Correspondence to Xianping Guo.

Additional information

Supported by NSFC, NCET and RFDP.

About this article

Cite this article

Zhang, L., Guo, X. Constrained continuous-time Markov decision processes with average criteria. Math Meth Oper Res 67, 323–340 (2008). https://doi.org/10.1007/s00186-007-0154-0

  • DOI: https://doi.org/10.1007/s00186-007-0154-0
