First passage Markov decision processes with constraints and varying discount factors | Frontiers of Mathematics Skip to main content
Log in

First passage Markov decision processes with constraints and varying discount factors

  • Research Article
  • Published:
Frontiers of Mathematics in China Aims and scope Submit manuscript

Abstract

This paper focuses on the constrained optimality problem (COP) of first passage discrete-time Markov decision processes (DTMDPs) in denumerable state and compact Borel action spaces with multi-constraints, state-dependent discount factors, and possibly unbounded costs. By means of the properties of a so-called occupation measure of a policy, we show that the constrained optimality problem is equivalent to an (infinite-dimensional) linear programming on the set of occupation measures with some constraints, and thus prove the existence of an optimal policy under suitable conditions. Furthermore, using the equivalence between the constrained optimality problem and the linear programming, we obtain an exact form of an optimal policy for the case of finite states and actions. Finally, as an example, a controlled queueing system is given to illustrate our results.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
¥17,985 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price includes VAT (Japan)

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. Altman E. Denumerable constrained Markov decision processes and finite approximations. Math Methods Oper Res, 1994, 19: 169–191

    Article  MATH  MathSciNet  Google Scholar 

  2. Altman E. Constrained Markov Decision Processes. Boca Raton: Chapman & Hall/CRC, 1999

    MATH  Google Scholar 

  3. Alvarez-Mena J, Hern´andez-Lerma O. Convergence of the optimal values of constrained Markov control processes. Math Methods Oper Res, 2002, 55: 461–484

    Article  MATH  MathSciNet  Google Scholar 

  4. Borkar V. A convex analytic approach to Markov decision processes. Probab Theory Related Fields, 1988, 78: 583–602

    Article  MATH  MathSciNet  Google Scholar 

  5. Derman C. Finite State Markovian Decision Processes. Mathematics in Science and Engineering, Vol 67. New York: Academic Press, 1970

  6. Dufour F, Prieto-Rumeau T. Finite linear programming approximations of constrained discounted Markov decision processes. SIAM J Control Optim, 2013, 51: 1298–1324

    Article  MATH  MathSciNet  Google Scholar 

  7. Gonz´alez-Hern´andez J, Hern´andez-Lerma O. Extreme points of sets of randomized strategies in constrained optimization and control problems. SIAM J Optim, 2005, 15: 1085–1104

    Article  MathSciNet  Google Scholar 

  8. Guo X P, Hern´andez-del-Valle A, Hern´andez-Lerma O. First passage problems for nonstationary discrete-time stochastic control systems. Eur J Control, 2012, 18: 528–538

    Article  MATH  MathSciNet  Google Scholar 

  9. Guo X P, Hern´andez-Lerma O. Continuous-Time Markov Decision Processes: Theory and Applications. Berlin: Springer-Verlag, 2009

    Book  Google Scholar 

  10. Guo X P, Piunovskiy A. Discounted continuous-time Markov decision processes with constraints: unbounded transition and loss rates. Math Oper Res, 2011, 36(1): 105–132

    Article  MATH  MathSciNet  Google Scholar 

  11. Guo X P, Song X Y, Zhang Y. First passage criteria for continuous-time Markov decision processes with varying discount factors and history-dependent policies. IEEE Trans Automat Control, 2014, 59: 163–174

    Article  MathSciNet  Google Scholar 

  12. Guo X P, Zhang W Z. Convergence of controlled models and finite-state approximation for discounted continuous-time Markov decision processes with constraints. Eur J Control, 2014, 238: 486–496

    Google Scholar 

  13. Hern´andez-Lerma O, Gonz´alez-Hern´andez J. Constrained Markov decision processes in Borel spaces: the discounted case. Math Methods Oper Res, 2000, 52: 271–285

    Article  MathSciNet  Google Scholar 

  14. Hern´andez-Lerma O, Lasserre J B. Discrete-Time Markov Control Processes. Basic Optimality Criteria. New York: Springer-Verlag, 1996

    Google Scholar 

  15. Hern´andez-Lerma O, Lasserre J B. Further Topics on Discrete-Time Markov Control Processes. New York: Springer-Verlag, 1999

    Book  Google Scholar 

  16. Huang Y H, Wei Q D, Guo X P. Constrained Markov decision processes with first passage criteria. Ann Oper Res, 2013, 206: 197–219

    Article  MATH  MathSciNet  Google Scholar 

  17. Mao X, Piunovskiy A. Strategic measures in optimal control problems for stochastic sequences. Stoch Anal Appl, 2000, 18: 755–776

    Article  MATH  MathSciNet  Google Scholar 

  18. Piunovskiy A. Optimal Control of Random Sequences in Problems with Constraints. Dordrecht: Kluwer Academic, 1997

    Book  MATH  Google Scholar 

  19. Piunovskiy A. Controlled random sequences: the convex analytic approach and constrained problems. Russian Math Surveys, 2000, 53: 1233–1293

    Article  MathSciNet  Google Scholar 

  20. Prokhorov Y. Convergence of random processes and limit theorems in probability theory. Theory Probab Appl, 1956, 1: 157–214

    Article  Google Scholar 

  21. Saldi N, Linder T, Y¨uksel S. Asymptotic optimality and rates of convergence of quantized stationary policies in stochastic control. IEEE Trans Automat Control, 2015, 60 (to appear)

  22. Sennott L I. Stochastic Dynamic Programming and the Control of Queueing Systems. New York: Wiley, 1999

    MATH  Google Scholar 

  23. Wei Q D, Guo X P. Markov decision processes with state-dependent discount factors and unbounded rewards/costs. Oper Res Lett, 2011, 39: 369–374

    MATH  MathSciNet  Google Scholar 

  24. Wu X, Guo X P. First passage optimality and variance minimization of Markov decision processes with varying discount factors. J Appl Probab, 2015, 52(2) (to appear)

  25. Zhang Y. Convex analytic approach to constrained discounted Markov decision processes with non-constant discount factors. TOP, 2013, 21: 378–408

    Article  MATH  MathSciNet  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Xianping Guo.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Wu, X., Zou, X. & Guo, X. First passage Markov decision processes with constraints and varying discount factors. Front. Math. China 10, 1005–1023 (2015). https://doi.org/10.1007/s11464-015-0479-6

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11464-015-0479-6

Keywords

MSC

Navigation