Abstract
Finding the optimal design for a discrete event dynamic system (DEDS) is in general difficult due to the large search space and the simulation-based performance evaluation. Various heuristics have been developed to find good designs. An important question is how to quantify the goodness of the heuristic designs. Inspired by Ordinal Optimization, which has become an important tool for optimizing DEDS, we provide a method that can quantify this goodness. By comparing with a set of designs that are uniformly sampled, we measure the ordinal performance of heuristic designs, i.e., we quantify the ranks of all (or some of) the heuristic designs among all the designs in the entire search space. The mathematical tool we use is hypothesis testing, and the probability of making a Type II error in the quantification is controlled to be below a very low level. The method can be used both when the performances of the designs can be accurately evaluated and when such performances are estimated by a crude but computationally easy model. The method can quantify both heuristics that output a single design and those that output a set of designs. The method is demonstrated through numerical examples.
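To make the procedure concrete, the following is a minimal Python sketch of the quantification step described above; the toy search space, the evaluator noisy_eval, and the noise model are illustrative assumptions, not part of the paper.

```python
# Minimal sketch of the quantification procedure, assuming a generic noisy
# black-box evaluator.  The search space and evaluator below are hypothetical
# placeholders; in practice they come from the user's simulation model.
import random

def observed_better_than_all_samples(heuristic_design, search_space, noisy_eval, n_samples=1000):
    """Uniformly sample n_samples designs (i.i.d., with replacement), observe each
    once, and check whether the heuristic design is observed to be better
    (smaller is better) than all of them."""
    sampled = random.choices(search_space, k=n_samples)
    best_observed = min(noisy_eval(theta) for theta in sampled)
    return noisy_eval(heuristic_design) < best_observed

# Toy illustration: designs are integers, true performance J(theta) = theta,
# observations are corrupted by additive noise.
Theta = range(100_000)
noisy_eval = lambda theta: theta + random.gauss(0.0, 50.0)

if observed_better_than_all_samples(3, Theta, noisy_eval, n_samples=1000):
    # Per the paper's result, the heuristic design may now be declared to be within
    # the top n% of Theta, where n% is read from the paper's Table 3 (or Eq. 31)
    # for |N| = 1000 and the chosen error level beta_0, with P(Type II error) <= beta_0.
    print("declare: heuristic design is within the top n% given by Theorem 1")
else:
    print("test inconclusive: no rank claim is made")
```

The fraction \(n\mbox{\%}\) that can be claimed depends only on |N| and the error level β0, not on the noise distribution (see the Appendix).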
Notes
The Sequential Probability Ratio Test (SPRT) can limit the two types of errors at the same time. This could be a future topic following the work in this paper.
Abbreviations
- Θ: The search space
- θ: An element of the search space
- N: Set of designs uniformly sampled from Θ
- \(\theta_H\): A heuristic design
- \(\theta_{n\mbox{\scriptsize\%}}\): The design ranking exactly at top \(n\mbox{\%}\) of Θ
- \(\theta_{N,i}\): The i-th design in N
- J(·): The true performance of a design
- \(\hat{J}(\cdot)\): Observed performance of a design
- \(\hat{J}(\theta_{N,[t]})\): The t-th order statistic of \(\hat{J}(\theta_{N,i})\), i = 1,2,...,|N|
- \(\beta_0\): Bounding level for the probability of making the Type II error
- \(R_{\Theta}(\cdot)\): The rank of a design in Θ
Acknowledgement
The authors would like to express their sincere thanks to Prof. Yu-Chi Ho of Harvard University and Tsinghua University, who initiated and guided the research in this paper. Of course, all remaining errors are solely the responsibility of the authors.
Additional information
This work was supported by NSFC under Grants 60574067, 60704008, 60721003, 60736027 and 90924001, the Specialized Research Fund for the Doctoral Program of Higher Education (20070003110), the Program of Introducing Talents of Discipline to Universities (National 111 International Collaboration Project, B06002), and the 2007 high-level graduate student scholarship of the China Scholarship Council. Part of the content of the paper was developed while Zhen Shen visited Boston University; Zhen Shen thanks Boston University for its hospitality. The content of this paper is also included in Zhen Shen's Ph.D. thesis at Tsinghua University.
Appendix A Proof of Theorem 1
Please refer to Section 4.2.1 for the statement of Theorem 1. We first prove the fundamental Eq. 31 of Theorem 1 in Section A.5, based on the results established in Sections A.1–A.4. The other results in Theorem 1 are proved in Section A.6.
A.1 Maximum of the probability of making Type II error
In this section we will prove
We introduce the indicator function:
We have the following,
This holds no matter which design \(\theta_H\) is. By the total expectation formula, we have,
We have,
Since \(\hat{J}(\theta_H)=J(\theta_H)+W_H\), from Eq. 52,
Please pay attention to Eq. 53: it is a conditional probability. For a given x, when \(J(\theta_H)\) becomes smaller, \(x - J(\theta_H)\) becomes larger, and so the event \(W_H < x - J(\theta_H)\) is more likely to happen.
If we have two designs \(\theta_{H,1}\) and \(\theta_{H,2}\) with \(J(\theta_{H,1}) \leq J(\theta_{H,2})\), then from Eq. 53 and the above analysis, we have,
where \(W_{H,1}\) and \(W_{H,2}\) are the noises when evaluating \(\theta_{H,1}\) and \(\theta_{H,2}\); they are I.I.D. Equation 54 tells us that, when \(J(\theta_{H,1}) \leq J(\theta_{H,2})\),
From Eqs. 50, 51 and 55, we have, when \(J(\theta_{H,1}) \leq J(\theta_{H,2})\),
Equation 56 tells us that the smaller the true performance of the heuristic design \(\theta_H\) is, the higher the probability that it is observed to be better than the t-th best design in N. Let us denote by \(\theta_{n\mbox{\scriptsize\%}}\) the design whose true rank in Θ is exactly at top \(n\mbox{\%}\) of Θ. Thus, we have,
This is because \(\theta_{n\mbox{\scriptsize\%}}\) is the best design among the designs with ranks not smaller than \(n{\mbox{\%}}\).
Thus, in order to limit the probability of making a Type II error, we need only consider the case in which this probability is the largest. When t = 1, from Eq. 57,
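In words, the conclusion of this subsection can be restated compactly as follows (a paraphrase using the notation of the symbol list, not a reproduction of the paper's displays):

\[ P\{\mbox{Type II error}\} \;\leq\; \max_{\theta_H:\, R_\Theta(\theta_H)\,\geq\, R_\Theta(\theta_{n\mbox{\scriptsize\%}})} P\left\{\hat{J}(\theta_H)<\hat{J}(\theta_{N,[1]})\right\} \;=\; P\left\{\hat{J}(\theta_{n\mbox{\scriptsize\%}})<\hat{J}(\theta_{N,[1]})\right\}. \]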
A.2 An upper bound for \(P\{\hat{J}(\theta_{n\mbox{\scriptsize\%}})<\hat{J}(\theta_{N,[1]})\}\)
We will prove
where \(f_W(x)\) and \(F_W(x)\) are the p.d.f. and the c.d.f. of the noise W, respectively. We have \(\hat{J}(\theta_{n\mbox{\scriptsize\%}})=J(\theta_{n\mbox{\scriptsize\%}})+W\), where W is the noise when evaluating this design. We have
Then,
If \(\theta_{n\mbox{\scriptsize\%}}\) is observed to be better than \(\hat{J}(\theta_{N,[1]})\), it means that this design is observed to be better than every design in N, i.e., it is observed to be better than \(\hat{J}(\theta_{N,i})=J(\theta_{N,i})+W_{N,i},\) i = 1,2,...,|N|. Since every \(\theta_{N,i}\) is uniformly sampled from the search space and \(W_{N,i}\) is I.I.D. noise, the \(\hat{J}(\theta_{N,i})\) are I.I.D. Thus Eq. 61 is transformed into
The last “=” in Eq. 62 holds because the \(\hat{J}(\theta_{N,i})\) are I.I.D. Now let us focus on Eq. 62; for any i, we have
By the total probability formula, we have
In Eq. 64, we have,
The last “=” in Eq. 65 holds because the noise is independent of the true performance of a uniformly sampled design. For Eq. 65, we have
It is “\(\leq 1-n\mbox{\%}\)”, not “\(=1-n{\mbox{\%}}\)”, in Eq. 66 because \(\mbox{OPC}_{\Theta}\) may not be strictly increasing. For a strictly increasing \(\mbox{OPC}_{\Theta}\), “\(\leq 1-n\mbox{\%}\)” in Eq. 66 should be replaced by “\(=1-n{\mbox{\%}}\)”. This is because for a strictly increasing \(\mbox{OPC}_{\Theta}\) every design has a different performance, and the heuristic design with rank at \(n{\mbox{\%}}\) is better than a uniformly sampled design when and only when the uniformly sampled design is drawn from the designs with worse ranks than the heuristic design. For a not strictly increasing \(\mbox{OPC}_{\Theta}\), the ranks still run from 1 to ∣ Θ ∣; ties are broken arbitrarily and the tied designs are given different but consecutive ranks. Thus, there can be designs having the same performance as \(\theta_{n\mbox{\scriptsize\%}}\) but with worse ranks than \(\theta_{n\mbox{\scriptsize\%}}\). This is the reason why it is “\(\leq 1-n\mbox{\%}\)” in Eq. 66.
From Eqs. 65 and 66, we know Eq. 64 can be changed to
Since no conditional probability can be larger than 1, we have
Here \(F_W(x)\) stands for the c.d.f. of the I.I.D. noise. By Eqs. 67 and 68, we have
From Eqs. 62, 63 and 69, we have
As we can see, we have proved
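As an illustrative sanity check of this inequality (not part of the paper), the sketch below estimates the left-hand side by Monte Carlo and evaluates the right-hand side by numerical integration; the Gaussian noise, the evenly spread true performances, and all numerical constants are assumptions made only for this check, whereas the bound itself is distribution free.

```python
# Monte Carlo sanity check of
#   P{ \hat{J}(theta_{n%}) < \hat{J}(theta_{N,[1]}) }  <=  \int (1 - n% F_W(x))^{|N|} f_W(x) dx.
# Gaussian noise and evenly spread true performances are assumed for illustration only.
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

rng = np.random.default_rng(0)

size_theta = 10_000     # |Theta|
g = 0.05                # n% written as a fraction (top 5%)
N = 100                 # |N|
sigma = 0.2             # noise standard deviation (assumption)
trials = 50_000

J = np.arange(1, size_theta + 1) / size_theta        # true performances, smaller is better
J_g = J[int(np.ceil(g * size_theta)) - 1]            # J(theta_{n%}): design ranked exactly at top n%

# Left-hand side: Monte Carlo estimate.
sampled = rng.integers(0, size_theta, size=(trials, N))            # uniform i.i.d. sampling from Theta
best_observed = (J[sampled] + rng.normal(0.0, sigma, size=(trials, N))).min(axis=1)
lhs = np.mean(J_g + rng.normal(0.0, sigma, size=trials) < best_observed)

# Right-hand side: numerical integration of the bound.
rhs, _ = quad(lambda x: (1.0 - g * norm.cdf(x, scale=sigma)) ** N * norm.pdf(x, scale=sigma),
              -np.inf, np.inf)

print(f"Monte Carlo estimate of LHS: {lhs:.4f}   integral bound (RHS): {rhs:.4f}")
```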
A.3 An upper bound of \(\int_{-\infty}^{\infty}(1-n\mbox{\%}F_W(x))^{|N|}f_W(x)dx\)
In this section, we shall prove that the right-hand side (RHS) of Eq. 71 is no larger than \(\min\limits_{p\in [0,1]}\{p+(1-n\mbox{\%}p)^{|N|}(1-p)\}\), that is,
We rewrite Eq. 70 as follows, i.e., the integral is divided into two parts, separated at \(x_0\), where \(x_0\) can be any real number,
For the first term in the RHS of Eq. 72, we have
For the second term in the RHS of Eq. 72, we have
From Eqs. 73 and 74, Eq. 72 can be changed to
Please recall that we are concerned with
Thus, from Eqs. 71 and 75, Eq. 76 can finally be bounded by
Please notice that Eq. 77 holds for any \(x_0\). Thus, for any p ∈ [0, 1], we can derive from Eq. 77 that
Thus, we have
Thus, our problem now reduces to optimizing Eq. 79. If we can find the minimum of the expression in Eq. 79, we obtain an upper bound on the probability of making a Type II error.
A.4 The convexity of the bound in the RHS of Eq. 79
In this section, we shall show \(p+(1-n\mbox{\%}p)^{|N|}(1-p)\) is a convex function of p in [0,1]. We denote,
We notice,
Thus, there must be a minimum and it appears within (0,1). Next, we prove the convexity of h(p) over (0, 1).
For |N| ≥ 2, we have
For |N| = 1, we have
Please see Eq. 83. The second term in the RHS is obviously larger than 0. And, inside the braces “{}”,
Thus, the first term in the RHS of Eq. 83 is also larger than 0. Then, for any positive integer |N|, we have
This finishes the proof of the convexity of \(p+(1-n\mbox{\%}p)^{|N|}(1-p)\) for p ∈ [0,1].
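As an optional numerical spot check of this convexity (an illustration only; the analytic proof above is what matters), one can verify that the discrete second differences of h(p) on a grid are nonnegative for a few illustrative pairs of \(n\mbox{\%}\) and |N|:

```python
# Discrete convexity spot check: on an equally spaced grid, a convex function
# satisfies h(p_{i-1}) + h(p_{i+1}) - 2 h(p_i) >= 0.  The tested (n%, |N|)
# pairs are arbitrary illustrative choices.
import numpy as np

def h(p, g, N):
    """h(p) = p + (1 - g p)^N (1 - p), with g the fraction n% and N = |N|."""
    return p + (1.0 - g * p) ** N * (1.0 - p)

p = np.linspace(0.0, 1.0, 2001)
for g in (0.01, 0.05, 0.2):
    for N in (1, 2, 10, 100):
        y = h(p, g, N)
        assert np.all(y[:-2] + y[2:] - 2.0 * y[1:-1] >= -1e-12), (g, N)
print("discrete convexity check passed for all tested (n%, |N|) pairs")
```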
A.5 Solution to \(\min\limits_{p\in [0,1]}\{p+(1-n\mbox{\%}p)^{|N|}(1-p)\}\)
In Section A.3, we have proved Eq. 79, which is
Thus, if
holds, the following is guaranteed,
We know from Section A.4 that there is one and only one minimum of h(p), p ∈ [0,1], and this minimum is attained within (0,1).
We should solve Eq. 82, i.e., \(\frac{dh(p)}{dp}=0\), to obtain the p corresponding to the minimum. However, it is difficult to obtain a closed form for p from Eq. 82.
There are at least two methods to address this difficulty. The first method is numerical calculation: since Eq. 87 guarantees convexity, a gradient method (or any one-dimensional convex solver) works very well; a minimal sketch is given below. The second method, presented after the sketch, is as follows.
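The following is a minimal numerical sketch of the first method (a sketch only, using SciPy's bounded scalar minimizer rather than an explicit gradient iteration; the particular values of \(n\mbox{\%}\) and |N| are arbitrary illustrative choices):

```python
# Numerical minimization of h(p) = p + (1 - n% p)^{|N|} (1 - p) over [0, 1].
# Since h is convex (Section A.4), a bounded one-dimensional solver finds the
# unique minimum reliably.  The values of n% and |N| are illustrative.
from scipy.optimize import minimize_scalar

def h(p, g, N):
    return p + (1.0 - g * p) ** N * (1.0 - p)

g, N = 0.10, 1000    # e.g. claiming top 10% against 1000 uniform samples
res = minimize_scalar(lambda p: h(p, g, N), bounds=(0.0, 1.0), method="bounded")
print(f"argmin p = {res.x:.4f}, upper bound on P(Type II error) = {res.fun:.4f}")
```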
What we want is \(\min\limits_{p\in [0,1]}\{p+(1-n\mbox{\%}p)^{|N|}(1-p)\}\leq \beta_0\). To make this hold, we only need to find a p ∈ [0,1] such that the following holds,
For a coefficient c, 0 < c < 1,
Since p = c×\(\beta_0\) ∈ (0,1), we can substitute Eq. 89 into Eq. 88 to obtain
From Eq. 90, we obtain
Equation 91 means that, if a heuristic design is observed to be better than every design in N, the heuristic design can be judged to be within the top \(n\mbox{\%}=\min\limits_{0<c<1}\frac{1}{c\times\beta_0}\left\{ 1-\exp\left\{ \frac{\ln\left( \frac{(1-c)\beta_0}{1-c\beta_0} \right)}{|N|} \right\} \right\}\) of the search space Θ, regardless of the type and magnitude of the noise. In doing so, the probability of the Type II error is not larger than the given level \(\beta_0\). This finishes the proof of Eq. 31. Solving Eq. 91 numerically, we obtain Table 3.
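The following short sketch (an assumed helper, not from the paper) evaluates the expression in Eq. 31 by a grid search over c; for a given \(\beta_0\) and |N| its output can be compared against the corresponding entry of Table 3.

```python
# Evaluate Eq. 31 by a grid search over c:
#   n% = min_{0<c<1} (1/(c beta_0)) * (1 - exp( ln((1-c) beta_0 / (1 - c beta_0)) / |N| )).
# The beta_0 and |N| values below are illustrative.
import math

def n_percent(beta0, N, grid=10_000):
    """Smallest top-n% fraction certified by Eq. 91 for the given beta0 and |N|."""
    best = float("inf")
    for k in range(1, grid):
        c = k / grid
        val = (1.0 - math.exp(math.log((1.0 - c) * beta0 / (1.0 - c * beta0)) / N)) / (c * beta0)
        best = min(best, val)
    return min(best, 1.0)   # a claim of "top 100%" is vacuous but always safe

for N in (1000, 2000, 5000, 10_000):
    print(f"|N| = {N:>6}:  n% = {n_percent(0.05, N):.4f}")
```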
A.6 The quick formula and the closed form for \(n{\mbox{\%}}\)
In this section, we shall prove that,
This is Eq. 34 in Theorem 1. And we will also prove,
If the above is true, the heuristic design can be judged to be within top \(n{\mbox{\%}}\) with the following expression,
This is Eq. 36 in Theorem 1. Before we give the proof, we give a lemma.
Lemma 2
\(1-\exp(x)\leq -x\) for any real x. And by simple transformations, we have, ln (1 + x) ≤ x, x > − 1.
Proof
Let \(f(x)=\exp(x)-1-x\). We have
Thus, f(x) is convex. The only minimum appears where \(\frac{df(x)}{dx}=\exp(x)-1=0\); solving this, we obtain x = 0. Thus, the minimum of f(x) is \(f(0) =\exp(0)-1-0= 0\). We then have
Then \(\exp(x)\geq 1+x\). Taking the logarithm of both sides (for 1 + x > 0), we have ln (1 + x) ≤ x, x > − 1.□
Applying Lemma 2 to Eq. 91 with \(x=\frac{\ln\left( \frac{(1-c)\beta_0}{1-c\beta_0} \right)}{|N|}\), we can easily establish
So, we can judge \(n\mbox{\%}\) to be
Given \(\beta_0\), we can numerically find the c that minimizes \(\frac{-\ln\left( \frac{(1-c)\beta_0}{1-c\beta_0} \right)}{c\times \beta_0}\) over 0 < c < 1. We find that when \(\beta_0 = 0.05\), the minimizing c is very close to c = 0.826. When c = 0.826, from Eq. 93, we have
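The quick-formula value for a given \(\beta_0\) and |N| can be checked numerically with the following sketch (the function name and grid search are illustrative assumptions); for \(\beta_0 = 0.05\) the minimizing c should come out close to the value 0.826 quoted above.

```python
# Numerical check of the quick formula: find the c minimizing
#   -ln((1-c) beta_0 / (1 - c beta_0)) / (c beta_0),  0 < c < 1,
# so that n% can be taken as (that minimum) / |N|.
import math

def quick_constant(beta0, grid=100_000):
    best_c, best_val = None, float("inf")
    for k in range(1, grid):
        c = k / grid
        val = -math.log((1.0 - c) * beta0 / (1.0 - c * beta0)) / (c * beta0)
        if val < best_val:
            best_c, best_val = c, val
    return best_c, best_val

c_star, const = quick_constant(0.05)
print(f"minimizing c = {c_star:.3f}, constant = {const:.2f}")   # n% is roughly constant / |N|
print(f"e.g. |N| = 1000 gives n% = {const / 1000:.4f}")
```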
To go further, we examine the numerator of the RHS of Eq. 92,
We have,
From Eq. 96, Eq. 95 can be changed to
By Lemma 2 (ln (1 + x) ≤ x, x > − 1), we have, for the second term of the RHS of Eq. 97,
From Eqs. 92, 95, 97 and 98, we have
We denote
We want to find the minimum of this function. We have
Thus, f(c) is convex for 0 < c < 1. There is one and only one minimum for 0 < c < 1. The minimum appears when
Solving it, we obtain
Substituting this c into f(c), we obtain
based on which Eq. 36 can be easily derived.
Cite this article
Shen, Z., Zhao, QC. & Jia, QS. Quantifying Heuristics in the Ordinal Optimization Framework. Discrete Event Dyn Syst 20, 441–471 (2010). https://doi.org/10.1007/s10626-009-0087-2