Abstract
In this paper, the identifiability and the estimability of a particular phase-type ageing model (PTAM) are investigated. We consider a PTAM that is shown to be identifiable, although it has poor estimability when the only observation is the time until absorption. We use a data-cloning method to assess model estimability, which is also analysed visually via contour plots and marginal likelihood functions. The PTAM’s estimability under different scenarios is compared, from the best scenario of a fully observable Markov process for each individual to the worst scenario wherein the only observation is the time until absorption for each individual. Some in-between scenarios are also studied and include the situations where the state could be observed every several years and the state could be measured with some error. Certain conditions are provided regarding the state of the Markov chain that improves the PTAM’s estimability.













Similar content being viewed by others
References
Albrecher H, Bladt M, Bladt M, Yslas J (2022) Mortality modeling and regression with matrix distributions. Insur Math Econ 107:68–87
Alalouf I, Styan G (1979) Characterizations of estimability in the general linear model. Ann Stat 7(1):194–200
Asmussen S, Nerman O, Olsson M (1996) Fitting phase-type distributions via the EM algorithm. Scandinavian J Stat 23:419–441
Augustin R, Büscher K (1982) Characteristics of the Cox-distribution. ACM Sigmetrics Perform Eval Rev 12(1):22–32
Bobbio A, Cumani A (1992) ML estimation of the parameters of a PH distribution in triangular canonical form, Computer. Perform Eval 22:33–46
Bolch G, Greiner S, Meer H, Trivedi K (2006) Queueing Networks and Markov Chains: Modeling and Performance Evaluation with Computer Science applications. Wiley-Interscience, New York
Bunch D (1991) Estimability in the multinomial probit model. Transp Res Part B Method 25(1):1–12
Bunke H, Bunke O (1974) Identifiability and estimability. Stat J Theor Appl Stat 5(3):223–233
Campbell D, Lele S (2014) An ANOVA test for parameter estimability using data cloning with application to statistical inference for dynamic systems. Comput Stat & Data Anal 70:257–267
Casella G, Berger R (2021) Statistical Inference. Cengage Learning, Belmont
Cavanaugh J, Neath A (2019) The Akaike information criterion: background, derivation, properties, application, interpretation, and refinements. Wiley Interdiscip Rev Comput Stat 11(3):e1460
Cheng B, Jones B, Liu X, Ren J (2021) The mathematical mechanism of biological aging. North Am Actuar J 25:73–93
Cheng B, Mamon R (2022) A uniformisation-driven algorithm for inference-related estimation of a phase-type ageing model. Lifetime Data Anal. https://doi.org/10.1007/s10985-022-09577-1
Cox D (1955) A use of complex probabilities in the theory of stochastic processes. Math Proc Cambridge Philos Soc 51(2):313–319
Cumani A (1982) On the canonical representation of homogeneous Markov processes modelling failure-time distributions. Microelectron Reliab 22(3):583–602
Dempster A, Laird N, Rubin D (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B Methodol 39(1):1–22
Deresa N, Keilegom I (2020) A multivariate normal regression model for survival data subject to different types of dependent censoring. Comput Stat & Data Anal 144:106879
Dörre A (2020) Bayesian estimation of a lifetime distribution under double truncation caused by time-restricted data collection. Stat Papers 61(3):945–965
Eberly L, Carlin B (2000) Identifiability and convergence issues for Markov chain Monte Carlo fitting of spatial models. Stat Med 19(17–18):2279–2294
Fackrell M, Characterisation of matrix-exponential distributions, PhD Thesis, University of Adelaide, Australia, (2003)
Fackrell M (2009) Modelling healthcare systems with phase-type distributions. Health Care Manage Sci 12(1):11–26
Faddy M (1994) Examples of fitting structured phase-type distributions. Appl Stoch Models Data Anal 10(4):247–255
Faddy M (1998) On inferring the number of phases in a Coxian phase-type distribution. Stoch Models 14(1–2):407–417
Faddy M, Latouche G, Taylor P (2002) Penalised maximum likelihood estimation of the parameters in a Coxian phase-type distribution. In: Latouche G, Taylor P (eds) Matrix-analytic methods: theory and applications. World Scientific, New Jersey, pp 107–114
Fisher F (1966) The identification problem in econometrics. McGraw-Hill, New York
Gibbs J (1902) Elementary principles in statistical mechanics: developed with special reference to the rational foundation of thermodynamics. Charles Scribner’s Sons, New York
Gompertz B (1825) On the nature of the function expressive of the law of human mortality, and on a new mode of determining the value of life contingencies. Philos Trans R Soc London 115:513–583
Govorun M, Jones B, Liu X, Stanford D (2018) Physiological age, health costs, and their interrelation. North Am Actuar J 22(3):323–340
Govorun M, Latouche G, Loisel S (2015) Phase-type aging modeling for health dependent costs. Insur Math Econ 62:173–183
Harper P, Knight V, Marshall A (2012) Discrete conditional phase-type models utilising classification trees: Application to modelling health service capacities. Eur J Operat Res 219(3):522–530
Hassan A, Jones B, Stanford D (2014) The use of phase-type models for disability insurance calculations. Scandinavian Actuar J 2014(8):714–728
Hastings W (1970) Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57(1):97–109
Hengl S, Kreutz C, Timmer J, Maiwald T (2007) Data-based identifiability analysis of non-linear dynamical models. Bioinformatics 23(19):2612–2618
Hsiao C (1983) Identification, Handbook of Econometrics 1. North Holland, Elsevier, pp 223–283
Huzurbazar A (1999) Flowgraph models for generalized phase type distributions having non-exponential waiting times. Scandinavian J Stat 26(1):145–157
Jacquez J, Greif P (1985) Numerical parameter identifiability and estimability: Integrating identifiability, estimability, and optimal sampling design. Math Biosci 77(1–2):201–227
Kullback S, Leibler R (1951) On information and sufficiency. Ann Math Stat 22(1):79–86
Kullback S (1997) Information theory statistics. Courier Corporation, New York
Lehmann E, Casella G (2006) Theory of point estimation. Springer Science & Business Media, New York
Lele S, Nadeem K, Schmuland B (2010) Estimability and likelihood inference for generalized linear mixed models using data cloning. J Am Stat Assoc 105(492):1617–1625
Lin X, Liu X (2007) Markov aging process and phase-type law of mortality. North Am Actuar J 11:92–109
Makeham W (1860) On the law of mortality and the construction of annuity tables. The Assurance Magazine, and Journal of the Institute of Actuaries 8(6):301–310
Makeham W (1889) On the further development of Gompertz’s law. J Inst Actuar (1886-1994) 28(2):152–159
Marshall A, Zenga M (2009) Simulating Coxian phase-type distributions for patient survival. Int Trans Operat Res 16(2):213–226
McClean S, Gillespie J, Garg L, Barton M, Scotney B, Kullerton K (2014) Using phase-type models to cost stroke patient care across health, social and community services. Eur J Operat Res 236(1):190–199
McLean K, McAuley K (2012) Mathematical modelling of chemical processes-obtaining the best model predictions and parameter estimates using identifiability and estimability procedures. Can J Chem Eng 90(2):351–366
Metropolis N, Rosenbluth A, Rosenbluth M, Teller A, Teller E (1953) Equation of state calculations by fast computing machines. J Chem Phys 21(6):1087–1092
Miettinen O (1976) Estimability and estimation in case-referent studies. Am J Epidemiol 103(2):226–235
Moriguchi S, Murota K (2012) On discrete Hessian matrix and convex extensibility. J Operat Res Soc Japan 55(1):48–62
Mu J (2019) Exploring the Estimability of Mark-Recapture Models with Individual, Time-Varying Covariates using the Scaled Logit Link Function, MSc Thesis, The University of Western Ontario, Canada
Neath A, Cavanaugh J (2012) The Bayesian information criterion: background, derivation, and applications. Wiley Interdiscip Rev Comput Stat 4(2):199–203
Nelder J, Mead R (1965) A simplex method for function minimization. Comput J 7(4):308–313
Payne K, Marshall H, Cairns K (2012) Investigating the efficiency of fitting Coxian phase-type distributions to health care data. IMA J Manage Math 23(2):133–145
Putter H, Houwelingen H (2015) Frailties in multi-state models: are they identifiable? Do we need them? Statistical Methods Med Res 24(6):675–692
Ramírez-Cobo P, Lillo R, Wiper M (2010) Nonidentifiability of the two-state Markovian arrival process. J Appl Prob 47(3):630–649
Titman A, Sharples L (2010) semi-Markov models with phase-type sojourn distributions. Biometrics 66(3):742–752
Vrieze S (2012) Model selection and psychological theory: A discussion of the differences between the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). Psychol Methods 17(2):228–243
Acknowledgements
We acknowledge the support of the Natural Sciences and Engineering Research Council of Canada through R. Mamon’s Discovery Grant (RGPIN-2017-04235). The valuable comments of Bruce Jones, the anonymous reviewers, and the editor are highly appreciated.
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Below is the link to the electronic supplementary material.
Appendices
Appendices
Appendix A: Flexibility of the PTAM’s resulting distribution
Obtaining a reasonably good fit for the PTAM was demonstrated in Sections 3.2 and 4.2 of Cheng et al. (2021). This motivated us to explore the flexibility of the PTAM. In particular, we consider the question: Can the PTAM achieve a good fit on any set of lifetime data? The assessment of the PTAM’s flexibility will be based on how well the resulting pdf and hazard function could approximate the true ones. Both functions are easily obtained from the data and commonly used to validate the calibrated result.
Definition A.1
We define i as the state label for states \(i=1,\ldots ,m\) in a Coxian model with m transient states.
Lemma A.1
Consider an \(m-\)state Coxian model with transition rate from one transient state to the next transient state equal to \(\lambda\), and the absorption rate in state i is \(h_i\) for \(i=1,\ldots ,m\). For any \(t>0\), let \(Y_t\) be the state variable at time t. Then, \(P(Y_t \ge k|Y_t \in E)\) is an increasing function with respect to t for any \(k=1,\ldots ,m\).
Proof
The \(m\times m\) intensity matrix of the Coxian model is
where \(\lambda\) is the rate of transition from state i to \(i+1\), and \(h_i\) is the absorption rate at state i. For any fixed \(t>0\), let \(Y_{\ell ,t}\) be the state random variable at time t assuming the starting state is \(\ell\) at 0, then \(Y_t=Y_{1,t}\). For every \(k=1,{\ldots },m\),
where \(\varvec{e_{k}}\) is an \(m\times 1\) column vector with first \(k-1\) elements equal to 0 and the remaining elements are equal to 1. The quantity \(P(Y_t \ge k|Y_t \in E)\) is the probability that the state of the individual at time t is greater than or equal to k conditional on this individual being alive. We note that \(P(Y_t \ge 1|Y_{t} \in E)=1\). When \(k \ne 1\), the first derivative of \(P(Y_t \ge k|Y_t \in E)\) with respect to t is
Let \(g(t)=\varvec{\alpha }e^{\varvec{\Lambda }t}(\varvec{\Lambda }\varvec{e_{k}}\varvec{\alpha } -\varvec{e_{k}}\varvec{\alpha }\varvec{\Lambda })e^{\varvec{\Lambda }t}\varvec{e}\). The (i, j) entry of \(e^{\varvec{\Lambda }t}\), denoted by \(P_{i,j}(t)\), is the probability of being in state j at time t given that the starting state is i at time 0. It may be verified that
where \(\lambda\) is the \((k-1,1)\) entry of \(\varvec{\Lambda }\varvec{e_{k}}\varvec{\alpha }\) and the entries of the entire first \((k-1)\)th rows in \(\varvec{e_{k}}\varvec{\alpha }\varvec{\Lambda }\) are 0’s.
Remark A.1
Bold subscripts are used in the vector notation, and normal subscripts are associated to an element of a vector. For example, since \(\varvec{A_k}\) is a vector, \(\varvec{A}_k\) is the kth element of vector \(\varvec{A}\).
Hence, \(\varvec{\Lambda }\varvec{e_{k}}\varvec{\alpha } -\varvec{e_{k}}\varvec{\alpha }\varvec{\Lambda } =(\varvec{\beta _1},\varvec{\beta _2},\varvec{0},\)...\(,\varvec{0})^{\intercal }\), where \(\varvec{\beta _2}\) is an \(m\times 1\) column vector with the first \(k-1\) elements all equal to 0 and the remaining elements are all equal to \(-\lambda\); and \(\varvec{\beta _1}\) is an \(m\times 1\) column vector with the first \(k-2\) elements equal to 0, the \((k-1)\)th element equals \(\lambda\), and the \(\ell\)th element (\(\ell \ge k\)) equals \(\lambda +h_1-h_{\ell }\). In matrix form,
Then, g(t) simplifies to
On the other hand, we know that
yielding
Rearranging further, we have
The first part of g(t) could be written as
Consequently,
which shows that \(g(t) \ge 0\) if and only if
with the equality holds if and only if \(g(t)=0\). Similarly, \(\frac{\textrm{d}P(Y_{\ell ,t}\ge k|Y_{\ell ,t} \in E)}{\textrm{d}t} \ge 0\) if and only if
The inequality is trivial when \(\ell =m-1\) because \(P(Y_{m,t}\ge k|Y_{m,t} \in E)=1\) and \(P(Y_{\ell ,t}\ge k|Y_{\ell ,t} \in E) \le 1\). Recall that \(Y_{\ell ,t}\) is the state variable at time t assuming the starting state is \(\ell\) at time 0. By letting \(t=0\), it could be verified that \(P(Y_{\ell ,t=0}\ge k|Y_{\ell ,t=0} \in E)=1\) when \(\ell \ge k\) and \(P(Y_{\ell ,t=0}\ge k|Y_{\ell ,t=0} \in E)=0\) when \(\ell < k\).
Since (A.1) holds when \(\ell =m-1\), we have \(\frac{\textrm{d}P(Y_{m-1,t}\ge k|Y_{m-1,t} \in E)}{\textrm{d}t} \ge 0\), or \(P(Y_{m-1,t}\ge k|Y_{m-1,t} \in E)\) is increasing with respect to t.
The next step is to prove \(P(Y_{m-2,t}\ge k|Y_{m-2,t} \in E) \le P(Y_{m-1,t}\ge k|Y_{m-1,t} \in E)\) by contradiction. Suppose
By (A.1), we have \(\frac{{\textrm{d}}P(Y_{m-2,t}\ge k|Y_{m-2,t} \in E)}{{\textrm{d}}t}<0\), or \(P(Y_{m-2,t}\ge k|Y_{m-2,t} \in E)\) is decreasing with respect to t. Therefore, for any \(t>0\),
which conflicts with (A.2). The first inequality holds because \(P(Y_{m-2,t}\ge k|Y_{m-2,t} \in E)\) is decreasing with respect to t. The second inequality holds because \(P(Y_{\ell ,t=0}\ge k|Y_{\ell ,t=0} \in E)\) is equal to 1 when \(\ell \ge k\); otherwise, it is equal to 0, so that \(P(Y_{m-1,t=0}\ge k|Y_{m-1,t=0} \in E)\) must be 1 if \(P(Y_{m-2,t=0}\ge k|Y_{m-2,t=0} \in E)=1\) for any k. The third inequality holds because we proved \(P(Y_{m-1,t}\ge k|Y_{m-1,t} \in E)\) is an increasing function of t.
Hence, (A.2) is false, and it must be that
which shows \(P(Y_{m-2,t}\ge k|Y_{m-2,t} \in E)\) is increasing with respect to t by (A.1). Following the same procedure recursively, it could be shown that \(P(Y_{1,t}\ge k|Y_{1,t} \in E) \le P(Y_{2,t}\ge k|Y_{2,t} \in E)\). Therefore, \(g(t) \ge 0\) and \(\frac{\textrm{d}P(Y_{t}\ge k|Y_{t} \in E)}{\textrm{d}t} \ge 0\), implying that \(P(Y_{t}\ge k|Y_{t} \in E)\) is an increasing function of t. \(\square\)
Theorem A.1
Consider the Coxian model with 3 restrictions: (i) the process starts in state 1 at time 0; (ii) the transition of going out from each state must be either to the next transient state or to the absorbing state; and (iii) the rates of transition from one transient state to the next transient state are the same. Suppose the absorption rate of the Coxian model is monotone with respect to the state label. The resulting hazard rate h(t) of the Coxian model is increasing with respect to age t if and only if the absorption rate \(h_i\) is increasing with respect to i.
Proof
For any \(t \ge 0\), the resulting hazard rate is
The above result shows that h(t) is the expected value of \((h_{Y_t}|Y_t \in E)\), where \(Y_t\) takes a state value at time t and E is the set of all transient states \(\{1,\ldots ,m\}\). The ith element of \(\varvec{\alpha }\exp (\varvec{\Lambda } t)\) is the probability in state i at time t.
By Lemma A.1, for any \(t_1<t_2\), \(Y_{t_1}\) is less than \(Y_{t_2}\) in stochastic order. It is well known that if the random variable A is less than the random variable B in stochastic order, then for any non-decreasing functions u, \({{\,\mathrm{\mathbb {E}}\,}}(u(A)) \le {{\,\mathrm{\mathbb {E}}\,}}(u(B))\). Hence, when \(h_i\) is increasing with respect to i, for any \(0 \le t_1<t_2\),
implying that h(t) is increasing with respect to age t.
On the other hand, when h(t) is increasing with respect to t, for any \(0 \le t_1<t_2\),
Suppose \(h_i\) is decreasing with respect to i. Then, \(-h_i\) is an increasing function. We showed that \(Y_{t_1}\) is less than \(Y_{t_2}\) in stochastic order. Therefore,
yielding \({{\,\mathrm{\mathbb {E}}\,}}\left( h_{Y_{t_1}}\big |Y_{t_1} \in E\right) \ge {{\,\mathrm{\mathbb {E}}\,}}\left( h_{Y_{t_2}}\big |Y_{t_2} \in E\right)\). Or, \(h(t_1)\ge h(t_2)\), which contradicts with h(t) being increasing. Consequently, the absorption rate \(h_i\) is increasing with respect to i.
\(\square\)
Since the PTAM’s absorption rate is monotone, the resulting hazard rate is monotone by Theorem A.1. A non-monotonic hazard rate is an impossibility for the PTAM. Therefore, we only investigate the goodness of fit of certain lifetime distributions with monotonic hazard rate.
Remark A.2
Suppose the absorption rate of the Coxian model is monotone with respect to the state label. Then, the resulting hazard rate h(t) is decreasing with respect to age t if and only if the dying rate \(h_i\) is decreasing with respect to i.
The following list includes the tested lifetime distributions we are interested in: (i) Gamma distribution with increasing hazard rate, (ii) Gamma distribution with decreasing hazard rate, (ii) Weibull distribution with increasing hazard rate, (iv) Weibull distribution with decreasing hazard rate, (v) Pareto distribution with decreasing hazard rate, (vi) Convolution of two exponential distributions, (vii) Convolution of two Weibull distributions, (viii) Gompertz-Makeham distribution, and (ix) Makeham’s second extension of the Gompertz distribution.
Our experiments demonstrate that the fitted results are promising for all tested distributions, except for the tail portion of heavy-tailed distributions. As expected, it is problematic to use light-tailed distributions, which include the PTAM, to approximate heavy-tailed distributions, because there is always a distinct tail requiring a large number of mixtures (states in the PTAM) to approximate well the tail. For example, Figures S.6 and S.8 of the Supplementary Material depict the typical fitted results for the heavy-tailed and light-tailed distributions, respectively. More discussion and fitted results are presented in Appendix B.
Appendix B: Experiments showing the flexibility of the resulting distribution from a PTAM
In this Appendix, we detail the analysis of PTAM’s flexibility as described in Appendix A. This is based on the concept of goodness-of-fit to the target distribution by the calibrated PTAM.
1.1 Appendix B.1: Some distributions for lifetime modelling
The Gamma distribution has two parameters: \(\alpha\) and \(\beta\). Its associated pdf is
where \(\Gamma (x)=\int _{z=0}^{\infty }z^{x-1}e^{-z}dz\) is the Gamma function. When \(\alpha >1\), the Gamma distribution has an increasing hazard rate, whilst the hazard rate for a Gamma distribution is decreasing when \(\alpha <1\). When \(\alpha =1\), the Gamma distribution reduces to an exponential distribution with a rate \(\lambda =\beta\). Since the family of Gamma distributions is rich and capable of reproducing a variety of distribution shapes, it is extensively used in lifetime modelling. The Gamma distribution is one of the potential lifetime distributions utilised to capture lifetime processes of humans and some other living things.
The Weibull distribution has two parameters: k and \(\lambda .\) Its associated pdf is
When \(k>1\), the Weibull distribution has an increasing hazard rate, whilst the hazard rate is decreasing when \(k<1\). A Weibull distribution where \(k=1\) reduces to an exponential distribution with rate \(\lambda\). The Weibull distribution is popular in survival analysis and affords a sound basis for modelling lifetime distributions.
The Pareto distribution has two parameters: k and \(\sigma\). Its associated survival function is
It is well-known that Pareto distributions are heavy-tailed.
A convolution of two exponential distributions has two parameters, and the random variable that represents a convolution of two exponential distributions is
where \(X_1\) and \(X_2\) follow the exponential distributions with rates \(\lambda _1\) and \(\lambda _2\), respectively. The convolutions of different distributions have a physical interpretation; i.e, the process could be decomposed into several components and the time spent on each component follows a certain distribution. Specifically, the physical interpretation for the convolution mechanism is consistent with our cognizance of ageing, being a progressive process. Furthermore, it is often assumed that the time spent in each component follows an exponential distribution because an exponential distribution matches most observed data. Therefore, a convolution of exponential distributions is a further extension capable of accomplishing more flexibility.
A convolution of two Weibull distributions has four parameters, and the random variable representing a convolution of two Weibull distributions is
where \(W_1\) and \(W_2\) follow the Weibull distribution with parameters \((\lambda _1, k_1)\) and \((\lambda _2, k_2)\) respectively. This is a simple extension of the previous example by changing the distributions of each component, similar to Huzurbazar (1999). Since the Weibull distribution is utilised in survival analysis, it is plausible for the ageing process that the time spent in each component follows a Weibull distribution as well. It is then worth exploring whether one could use the PTAM to approximate well a convolution of Weibull distributions.
The Gompertz-Makeham distribution has three parameters, and its associated hazard rate is
where \(\xi\) is the growth rate of mortality and \(\zeta\) represents the background mortality with additional constraint \(\zeta > 0\).
The Gompertz-Makeham distribution were proposed by Makeham (1860) and Gompertz (1825), and it assumes a shifted exponential increasing hazard rate with respect to age. Such a distribution fits human mortality rates from age 30 to 80 well. When we calibrate the PTAM with human lifetime observations only, the PTAM is essentially approximating the Gompertz-Makeham distribution.
The Makeham’s second extension of the Gompertz distribution has four parameters, and its survival function is
Makeham (1889) extended the assumptions on the differential equation for the hazard rate to higher-order derivatives. This was motivated by the observation that the third-order derivatives of the empirical hazard rates were much closer to a geometric progression. One may verify that the third-order derivatives, with respect to time t, of the hazard rate arising from the Makeham’s second extension of the Gompertz distribution is \(\xi \lambda ^3(\exp (\lambda ))^t\), which is a geometric progression with initial value \(\xi \lambda ^3\) and common ratio \(\exp (\lambda )\).
Once again, it is likely to observe human lifetimes following the extended Gompertz distribution. For each example, we select appropriate parameter values to achieve a particular shape of probability density function or hazard function. Once the parameter values are chosen, the target lifetime distributions are completely determined.
1.2 Appendix B.2: Calibration criterion
We shall use the Kullback-Leibler (KL) divergence (Kullback and Leibler 1951; Kullback 1997) to summarise the information loss when using the PTAM to approximate the target lifetime distribution, which is known. When calibrating the PTAM, we minimise the KL divergence to ensure minimal lost information.
Definition B.1
For continuous distributions P and Q with respective pdf’s p(x) and q(x) defined on a sample probability space, the KL divergence \(D_{KL}\) of Q from P is
Usually, the distribution P represents the actual model, and the distribution Q represents the distribution used to approximate P. In our case, the distribution P is the target lifetime distribution in each example, and the distribution Q is the PTAM.
The KL divergence’s non-negativity is a nice property as discussed further below. Specifically,
with equality if and only if \(P=Q\). This result is known as the Gibbs’ inequality (Gibbs 1902). Thus, the divergence value could be interpreted as the extra information required for Q to represent P. Therefore, the smaller the divergence, the less extra information is required when using Q to approximate P and the better goodness of fit is for Q. Furthermore, by its non-negative property, the closer the divergence value to 0, the better the approximation.
In our case, we calibrate the PTAM by minimising the KL divergence of the PTAM from the target lifetime distribution. The analytical formula for the PTAM’s resulting distribution is quite involved. However, the resulting distribution is fairly easy to evaluate numerically for a given set of parameter values. Therefore, we numerically search for the minimum of \(D_{KL}(\text {lifetime distribution} \ || \ \text {proposed PTAM})\), where p(x) is the pdf of the target lifetime distribution, q(x) is the pdf of the PTAM, and
The numerical search utilises the optimisation function fmincon in MATLAB. The calibration process is similar to the model calibration in the Le Bras model simulation in Cheng et al. (2021), except for switching the optimisation criterion from negative log-likelihood to the KL divergence. The Tail Value-at-Risk (TVaR) with risk level 0.999 denoted by \(\text {TVaR}_{0.999}(T)\) is calculated in each example, and the calculated value is used to estimate \(\psi\). The estimated values for the other four parameters are the values that minimise the divergence. For the fitted results, refer to Tables S.1 – S.10 and Figures S.1 – S.10 of the Supplementary Material.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Cheng, B., Mamon, R. Examining the identifiability and estimability of the phase-type ageing model. Comput Stat 39, 963–1004 (2024). https://doi.org/10.1007/s00180-023-01329-5
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00180-023-01329-5