Markov control processes with pathwise constraints

Mendoza-Pérez, Armando F.; Hernández-Lerma, Onésimo

doi:10.1007/s00186-010-0311-8

Original Article
Published: 04 June 2010

Volume 71, pages 477–502, (2010)

Mathematical Methods of Operations Research

Armando F. Mendoza-Pérez¹ & Onésimo Hernández-Lerma²

Abstract

This paper deals with discrete-time Markov control processes in Borel spaces, with unbounded rewards. The criterion to be optimized is a long-run sample-path (or pathwise) average reward subject to constraints on a long-run pathwise average cost. To study this pathwise problem, we give conditions for the existence of optimal policies for the problem with “expected” constraints. Moreover, we show that the expected case can be solved by means of a parametric family of optimality equations. These results are then extended to the problem with pathwise constraints.
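
To make the distinction between pathwise and expected criteria concrete, the following is a minimal sketch, not the authors' method. It assumes a hypothetical finite model with two states and two actions (the paper itself works in Borel spaces with unbounded rewards), and every name and number in it (`P`, `REWARD`, `COST`, `pathwise_averages`, `satisfies_constraint`, the threshold `theta`) is invented for illustration. It simulates a single trajectory under a stationary randomized policy and computes the sample-path averages of reward and cost over a finite horizon, which only approximate the long-run averages in the abstract.

```python
import random

# Hypothetical two-state, two-action model (illustrative only, not from the paper).
# P[s][a] is the probability of moving to state 1 from state s under action a.
P = {0: {0: 0.2, 1: 0.7}, 1: {0: 0.4, 1: 0.9}}
REWARD = {0: {0: 1.0, 1: 2.0}, 1: {0: 0.0, 1: 3.0}}
COST = {0: {0: 0.0, 1: 1.0}, 1: {0: 0.5, 1: 2.0}}


def pathwise_averages(prob_action_1, n_steps, seed=0):
    """Simulate one trajectory under a stationary randomized policy.

    prob_action_1[s] is the probability of choosing action 1 in state s.
    Returns the average reward and average cost along this single path.
    """
    rng = random.Random(seed)
    state = 0
    total_reward = 0.0
    total_cost = 0.0
    for _ in range(n_steps):
        action = 1 if rng.random() < prob_action_1[state] else 0
        total_reward += REWARD[state][action]
        total_cost += COST[state][action]
        state = 1 if rng.random() < P[state][action] else 0
    return total_reward / n_steps, total_cost / n_steps


def satisfies_constraint(avg_cost, theta):
    """Check the finite-horizon analogue of the pathwise cost constraint."""
    return avg_cost <= theta


if __name__ == "__main__":
    policy = {0: 0.5, 1: 0.5}  # assumed policy: choose action 1 half the time
    avg_reward, avg_cost = pathwise_averages(policy, n_steps=100_000)
    print(avg_reward, avg_cost, satisfies_constraint(avg_cost, theta=1.0))
```

A pathwise constraint asks that the average cost along almost every trajectory stay below the threshold, whereas an expected constraint only bounds the mean of that average over trajectories. Running the sketch with several seeds shows how much the per-path averages can differ at a finite horizon.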

Author information

Authors and Affiliations

Universidad Politécnica de Chiapas, Calle Eduardo J. Selvas S/N, Tuxtla Gutiérrez, Chiapas, Mexico
Armando F. Mendoza-Pérez
Mathematics Department, CINVESTAV-IPN, A. Postal 14-740, Mexico, DF, 07000, Mexico
Onésimo Hernández-Lerma

Corresponding author

Correspondence to Onésimo Hernández-Lerma.

About this article

Cite this article

Mendoza-Pérez, A.F., Hernández-Lerma, O. Markov control processes with pathwise constraints. Math Meth Oper Res 71, 477–502 (2010). https://doi.org/10.1007/s00186-010-0311-8

Received: 28 January 2009
Accepted: 11 May 2010
Published: 04 June 2010
Issue Date: June 2010
DOI: https://doi.org/10.1007/s00186-010-0311-8
