Abstract
Given a differential game, if agents have different time preference rates, cooperative (Pareto optimum) solutions obtained by applying Pontryagin’s maximum principle become time inconsistent. We derive a set of dynamic programming equations in continuous time whose solutions are time-consistent equilibria for problems in which agents differ in their utility functions and also in their time preference rates. The solution assumes cooperation between agents at every time. Since coalitions at different times have different time preferences, equilibrium policies are calculated by looking for Markov (subgame perfect) equilibria in a (noncooperative) sequential game. The results are applied to the study of a cake-eating problem describing the management of a common property exhaustible natural resource. The extension of the results to a simple common property renewable natural resource model in infinite horizon is also discussed.



Notes
We refer to the preliminary version of this paper, de-Paz et al. [6], for the details.
Throughout the paper, we omit the subscript in \(x_t\) when it is not strictly necessary.
As in the standard case, the same DPE is obtained if x(T) is fixed.
References
Barro, R.J. (1999). Ramsey meets Laibson in the neoclassical growth model. Quarterly Journal of Economics, 114, 1125–1152.
Breton, M., & Keoula, M.Y. (2010). A great fish war model with asymmetric players. GERAD Working paper 2010–73.
Breton, M., & Keoula, M.Y. (2012). Farsightedness in a coalitional great fish war. Environmental and Resource Economics, 51(2), 297–315.
Clark, C.W. (1990). Mathematical bioeconomics: the optimal management of renewable resources. Wiley, New York.
Clemhout, S., & Wan, H.Y. (1985). Dynamic common property resources and environmental problems. Journal of Optimization Theory and Applications, 46(4), 471–481.
de-Paz, A., Marín-Solano, J., Navas, J. (2011). Time consistent Pareto solutions in common access resource games with asymmetric players. Documents de treball (Facultat d’Economia i Empresa. Espai de Recerca en Economia), E11/253.
Dockner, E.J., Jorgensen, S., Long, N.V., Sorger, G. (2000). Differential games in economics and management science. Cambridge University Press, Cambridge.
Ekeland, I., & Pirvu, T. (2008). Investment and consumption without commitment. Mathematics and Financial Economics, 2(1), 57–86.
Frederick, S., Loewenstein, G., O’Donoghue, T. (2002). Time discounting and time preference: a critical review. Journal of Economic Literature, 40, 351–401.
Gollier, C., & Zeckhauser, R. (2005). Aggregation of heterogeneous time preferences. The Journal of Political Economy, 113(4), 878–896.
Haurie, A. (1976). A note on nonzero-sum differential games with bargaining solution. Journal of Optimization Theory and Applications, 18(1), 31–39.
Jouini, E., Marin, J.M., Napp, C. (2010). Discounting and divergence of opinion. Journal of Economic Theory, 145, 830–859.
Jorgensen, S., Martín-Herran, G., Zaccour, G. (2010). Dynamic games in the economics and management of pollution. Environmental Modeling and Assessment, 15, 433–467.
Karp, L. (2007). Non-constant discounting in continuous time. Journal of Economic Theory, 132, 557–568.
Karp, L., & Tsur, Y. (2011). Time perspective and climate change policy. Journal of Environmental Economics and Management, 62(1), 1–14.
Li, C.Z., & Löfgren, K.G. (2000). Renewable resources and economic sustainability: a dynamic analysis with heterogeneous time preferences. Journal of Environmental Economics and Management, 40, 236–250.
Long, N.V. (2011). Dynamic games in the economics of natural resources: a survey. Dynamic Games and Applications, 1(1), 115–148.
Long, N.V., Shimomura, K., Takahashi, H. (1999). Comparing open-loop with Markov equilibria in a class of differential games. The Japanese Economic Review, 50(4), 457–469.
Marín-Solano, J., & Navas, J. (2009). Non-constant discounting in finite horizon: the free terminal time case. Journal of Economic Dynamics and Control, 33(3), 666–675.
Marín-Solano, J., & Patxot, C. (2012). Heterogeneous discounting in economic problems. Optimal Control Applications and Methods, 33(1), 32–50.
Marín-Solano, J., & Shevkoplyas, E.V. (2011). Non-constant discounting and differential games with random horizon. Automatica, 47(12), 2626–2638.
Petrosyan, L.A., & Zaccour, G. (2003). Time-consistent Shapley value allocation of pollution cost reduction. Journal of Economic Dynamics and Control, 27, 381–398.
Pollak, R.A. (1968). Consistent planning. Review of Economic Studies, 35, 201–208.
Sorger, G. (2006). Recursive bargaining over a productive asset. Journal of Economic Dynamics and Control, 30, 2637–2659.
Strotz, R.H. (1956). Myopia and inconsistency in dynamic utility maximization. Review of Economic Studies, 23, 165–180.
Yeung, D.W.K., & Petrosyan, L.A. (2006). Cooperative stochastic differential games. Springer, New York.
Zaccour, G. (2008). Time consistency in cooperative differential games: a tutorial. INFOR, 46(1), 81–92.
Acknowledgements
The authors acknowledge the referee and the associate editor for their valuable comments. This work has been partially supported by MEC (Spain) Grant ECO2010-18015. J. Navas also acknowledges financial support from the Consejería de Educación de la Junta de Castilla y León (Spain) under project VA056A09.
Appendix
Solution of the DPE (16) We guess a value function of the form \(W(x,y,t) = A(t)\ln(x) + B(t)y + C(t)\). If this choice proves to be consistent, the extraction rates for the two agents are given by \(c_1(t) = 1/W_x = x/A(t)\) and \(c_2(t) = W_y/W_x = B(t)x/A(t)\). In order to solve Eq. 16, we calculate the expression for \(K(x,y,t)\). To do so, we substitute the “guessed” controls in Eq. 15. Hence, \(x(s)=x_t \exp\left(\Lambda_t(s)\right)\), with \(\Lambda_t(s)=-\int_t^s \frac{1+B(\tau)}{A(\tau)}\, d\tau\). Therefore, \(K(x,y,t)= (r_1-r_2)\int_t^T e^{-r_1(s-t)} \ln \big( \frac{x_t e^{\Lambda_t(s)}}{A(s)} \big)\, ds = \frac{r_1-r_2}{r_1}\big( 1-e^{-r_1(T-t)}\big) \ln(x_t) + (r_1-r_2)\int_t^T e^{-r_1(s-t)} \ln \big( \frac{e^{\Lambda_t(s)}}{A(s)} \big)\, ds\). By substituting in Eq. 16 and simplifying, we obtain
Since the above equation must be satisfied for every x and y, then
Using the terminal condition \(B(T) = 1\), we obtain \(B(t) = 1\) and \(c_1(t) = c_2(t) = x/A(t)\), for every t ∈ [0,T]. With respect to A(t), note that if \(A(t)=\sum_{i=1}^2 \frac{1-e^{-r_i(T-t)}}{r_i}\), which describes the solution for a naive coalition (see (14)), then (38) is satisfied and, in addition, the solution to the state equation \(\dot{x}(t)=-2 x(t)/A(t)\) verifies the terminal condition \(\lim_{t\to T} x(t) = 0\). Therefore, the naive solution verifies the DPE (16).
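As a quick numerical check (our own illustrative sketch, not from the paper; the horizon \(T=1\), rates \(r_1=0.05\), \(r_2=0.10\), and step count are arbitrary choices), one can integrate the naive state equation \(\dot{x}=-2x/A(t)\) with \(A(t)=\sum_{i=1}^2 (1-e^{-r_i(T-t)})/r_i\) and verify that the cake is exhausted at the horizon:

```python
import math

def A(t, T=1.0, rates=(0.05, 0.10)):
    """Naive-coalition coefficient: A(t) = sum_i (1 - e^{-r_i (T - t)}) / r_i."""
    return sum((1.0 - math.exp(-r * (T - t))) / r for r in rates)

def simulate_cake(x0=1.0, T=1.0, n=100_000):
    """Euler integration of xdot = -2 x / A(t); each of the two agents
    extracts x / A(t).  We stop one step short of T, where A(T) = 0 and the
    extraction rate blows up."""
    dt = T / n
    x, t = x0, 0.0
    for _ in range(n - 1):
        x -= 2.0 * x / A(t, T) * dt
        t += dt
    return x

x_end = simulate_cake()
# x_end is tiny but positive: the resource is (numerically) exhausted as t -> T.
```

Since \(A(t)\approx 2(T-t)\) near the horizon, the extraction rate \(2x/A(t)\) behaves like \(x/(T-t)\), which is what drives the state to zero exactly at \(t=T\).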
Derivation of the Dynamic Programming Algorithm in Discrete Time (21) In the final period, we define \(V_n^*=0\), as usual. For j = n − 1, the optimal value for (19) will be given by the solution to the problem
with \(x_n = x_{n-1} + f(x_{n-1}, u_{n-1},(n-1)\epsilon)\epsilon\). If \(c^*_{n-1}(x_{n-1},(n-1)\epsilon)\) is the maximizer of the right-hand term of the above equation, let us denote
In general, for j = 1,...,n − 1, the value \(V_j^*(x_j,j\epsilon)\) in (19) can be written as
Since
then we can write \(V_{j+1}^*(x_{j+1},(j+1)\epsilon) - \sum_{i=0}^{n-j-2} \sum_{m=1}^N \lambda_m e^{-r_m i \epsilon}\, \bar{U}^m_{j+i+1}(x_{j+i+1},(j+i+1)\epsilon)\,\epsilon = 0\). Adding the former expression to (39), we obtain (21).
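The recursion (21) can be sketched numerically by backward induction. The following is an illustrative sketch for a two-player discrete cake-eating problem (the grid, logarithmic utilities, rates, weights \(\lambda_m=1\), and equal split are our own choices for illustration, not the paper's): each player \(m\) carries a value function discounted at his own rate \(r_m\), and the stage-\(j\) coalition chooses the current extraction taking the later coalitions' rules as given.

```python
import math

def solve_dp(T=1.0, n=20, rates=(0.05, 0.10)):
    """Backward induction for a discretized two-player cake-eating problem.
    Each player m keeps his own value function V[m]; the stage-j coalition
    maximizes current aggregate log-utility plus each player's continuation
    value discounted at that player's own rate r_m."""
    eps = T / n
    grid = [0.01 * k for k in range(1, 201)]   # cake levels 0.01 .. 2.00
    fracs = [0.05 * k for k in range(1, 20)]   # candidate fractions eaten per stage
    V = [[0.0] * len(grid) for _ in rates]     # terminal values V_n^* = 0

    def interp(Vm, x):
        """Piecewise-linear interpolation of Vm on grid, clamped at the ends."""
        if x <= grid[0]:
            return Vm[0]
        if x >= grid[-1]:
            return Vm[-1]
        k = int((x - grid[0]) / 0.01)
        w = (x - grid[k]) / 0.01
        return (1 - w) * Vm[k] + w * Vm[k + 1]

    policy = []                                # policy[0] is the last stage's rule
    for _ in range(n):
        newV = [[0.0] * len(grid) for _ in rates]
        rule = []
        for i, x in enumerate(grid):
            best, best_a = -math.inf, None
            for a in fracs:
                c = a * x / (2 * eps)          # per-player extraction rate (equal split)
                total = sum(math.log(c) * eps +
                            math.exp(-r * eps) * interp(V[m], (1 - a) * x)
                            for m, r in enumerate(rates))
                if total > best:
                    best, best_a = total, a
            rule.append(best_a)
            c = best_a * x / (2 * eps)
            for m, r in enumerate(rates):
                newV[m][i] = math.log(c) * eps + math.exp(-r * eps) * interp(V[m], (1 - best_a) * x)
        V = newV
        policy.append(rule)
    return V, policy
```

Because the terminal value is zero, the last-period coalition extracts as much as the candidate grid allows; earlier coalitions trade current utility against the own-rate-discounted continuation values computed at later stages.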
Derivation of the DPE in Continuous Time (22) Let \(W^m(x,t)\) be a continuously differentiable function representing the value function of player m in the t coalition, and let \(W(x,t)=\sum_{m=1}^N W^m(x,t)\) be the value function for the t coalition, with initial condition x(t) = x. Since s = jε, for sufficiently small ε, \(x(s+\epsilon)-x(s)\simeq f(x(s),c(s),s)\epsilon\), \(W(x(t),t)\simeq V_j(x_j,j\epsilon)\), and \(W(x(t+\epsilon),t+\epsilon)=W(x(t),t)+\nabla_x W(x(t),t)\cdot f(x(t),c(t),t)\epsilon+\nabla_t W(x(t),t)\epsilon+o(\epsilon)\). Substituting in (21), we obtain
where
Finally, by dividing (40) and (41) by ε and taking the limit ε→0, we obtain Eq. 22.
Derivation of the DPE (32) Without loss of generality, for simplicity we take \(\lambda_1=\cdots=\lambda_N=1\). If \(c^*(s)=\phi(s,x(s))\) is the equilibrium rule, then the value function is
where \(\dot{x}(s)=f(x(s),\phi(x(s),s),s)\), \(x(t)=x_t\). We assume that if τ = ∞, the value function (42) along the equilibrium rule is finite (i.e., the integral converges). This is guaranteed if we restrict our attention to strategies φ(x,s) of class \(C^1\) such that, as \(t\to\infty\), the state variables converge to a stationary state.
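For reference, with \(\lambda_1=\cdots=\lambda_N=1\) the value function (42) along the equilibrium rule takes the form below (a reconstruction from the surrounding definitions; the original display may differ in notation):

```latex
W(x_t,t) \;=\; \int_t^{\tau} \sum_{m=1}^{N} e^{-r_m (s-t)} \,
U^m\!\big(x(s),\phi(x(s),s),s\big)\, ds ,
\qquad \dot{x}(s) = f\big(x(s),\phi(x(s),s),s\big), \quad x(t) = x_t .
```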
Next, for ε > 0, let us consider the variations
If the t agent can precommit her behavior during the period [t,t + ε], the value function for the perturbed control path \(c_\epsilon\) is given by
Let us assume that \(W_\epsilon\) is differentiable in ε in a neighborhood of ε = 0. Then, \(c^*(s)=\phi(s,x(s))\) is called an equilibrium rule if
The above definition can be interpreted as follows: for sufficiently small ε, the maximum of \(W_\epsilon\) in the limit \(\epsilon\to 0\) is precisely W(x,t). In order to prove that \(c^*(t)=\phi(x,t)\) solving the right-hand term in Eq. 32 is an equilibrium rule, we have to check Eq. 43. We do this in several steps:
If \(\bar{x}(s)\) denotes the state trajectory corresponding to the decision rule \(c_\epsilon(s)\), then
Note that
In a similar way,
Therefore,
since \(c^*=\phi(x,t)\) is the maximizer of the right-hand term in Eq. 32.
de-Paz, A., Marín-Solano, J. & Navas, J. Time-Consistent Equilibria in Common Access Resource Games with Asymmetric Players Under Partial Cooperation. Environ Model Assess 18, 171–184 (2013). https://doi.org/10.1007/s10666-012-9339-x