Abstract
Finding the optimal design for a discrete event dynamic system (DEDS) is in general difficult due to the large search space and the simulation-based performance evaluation. Various heuristics have been developed to find good designs. An important question is how to quantify the goodness of the heuristic designs. Inspired by Ordinal Optimization, which has become an important tool for optimizing DEDS, we provide a method that can quantify this goodness. By comparing with a set of designs that are uniformly sampled, we measure the ordinal performance of heuristic designs, i.e., we quantify the ranks of all (or some of) the heuristic designs among all the designs in the entire search space. The mathematical tool we use is hypothesis testing, and the probability of making a Type II error in the quantification is controlled to be below a very low level. The method can be used both when the performances of the designs can be accurately evaluated and when such performances are estimated by a crude but computationally easy model. The method can quantify both heuristics that output a single design and those that output a set of designs. The method is demonstrated through numerical examples.
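To make the procedure concrete, the following is a minimal Python sketch of the quantification step described above; the toy search space, the evaluator noisy_eval, and the noise model are illustrative assumptions, not part of the paper.

```python
# Minimal sketch of the quantification procedure, assuming a generic noisy
# black-box evaluator.  The search space and evaluator below are hypothetical
# placeholders; in practice they come from the user's simulation model.
import random

def observed_better_than_all_samples(heuristic_design, search_space, noisy_eval, n_samples=1000):
    """Uniformly sample n_samples designs (i.i.d., with replacement), observe each
    once, and check whether the heuristic design is observed to be better
    (smaller is better) than all of them."""
    sampled = random.choices(search_space, k=n_samples)
    best_observed = min(noisy_eval(theta) for theta in sampled)
    return noisy_eval(heuristic_design) < best_observed

# Toy illustration: designs are integers, true performance J(theta) = theta,
# observations are corrupted by additive noise.
Theta = range(100_000)
noisy_eval = lambda theta: theta + random.gauss(0.0, 50.0)

if observed_better_than_all_samples(3, Theta, noisy_eval, n_samples=1000):
    # Per the paper's result, the heuristic design may now be declared to be within
    # the top n% of Theta, where n% is read from the paper's Table 3 (or Eq. 31)
    # for |N| = 1000 and the chosen error level beta_0, with P(Type II error) <= beta_0.
    print("declare: heuristic design is within the top n% given by Theorem 1")
else:
    print("test inconclusive: no rank claim is made")
```

The fraction \(n\mbox{\%}\) that can be claimed depends only on |N| and the error level β0, not on the noise distribution (see the Appendix).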
Notes
The Sequential Probability Ratio Test (SPRT) can limit the two types of errors at the same time. This could be a future topic following the work in this paper.
Abbreviations
- Θ: The search space
- θ: An element of the search space
- N: Set of designs uniformly sampled from Θ
- \(\theta_H\): A heuristic design
- \(\theta_{n\mbox{\scriptsize\%}}\): The design ranking exactly at top \(n\mbox{\%}\) of Θ
- \(\theta_{N,i}\): The i-th design in N
- J(·): The true performance of a design
- \(\hat{J}(\cdot)\): Observed performance of a design
- \(\hat{J}(\theta_{N,[t]})\): The t-th order statistic of \(\hat{J}(\theta_{N,i})\), i = 1,2,...,|N|
- \(\beta_0\): Bounding level for the probability of making the Type II error
- \(R_{\Theta}(\cdot)\): The rank of a design in Θ
Acknowledgement
The authors would like to express their sincere thanks to Prof. Yu-Chi Ho of Harvard University and Tsinghua University, who initiated and guided the research in this paper. Of course, all remaining errors are solely the responsibility of the authors.
Additional information
This work was supported by NSFC under Grants 60574067, 60704008, 60721003, 60736027 and 90924001, the Specialized Research Fund for the Doctoral Program of Higher Education (20070003110), the Program of Introducing Talents of Discipline to Universities (National 111 International Collaboration Project, B06002), and the 2007 high-level graduate student scholarship of the China Scholarship Council. Part of the content of the paper was developed while Zhen Shen visited Boston University; Zhen Shen thanks Boston University for its hospitality. The content of this paper is also included in Zhen Shen's Ph.D. thesis at Tsinghua University.
Appendix A Proof of Theorem 1
Please refer to Section 4.2.1 for the statement of Theorem 1. We first prove the fundamental Eq. 31 of Theorem 1 in Section A.5, based on the results established in Sections A.1–A.4. The other results in Theorem 1 are proved in Section A.6.
A.1 Maximum of the probability of making Type II error
In this section we will prove
We introduce the indicator function:
We have the following,
This holds no matter which design \(\theta_H\) is. By the total expectation formula, we have,
We have,
Since \(\hat{J}(\theta_H)=J(\theta_H)+W_H\), from Eq. 52,
Please pay attention to Eq. 53: it is a conditional probability. For a given x, when \(J(\theta_H)\) becomes smaller, \(x - J(\theta_H)\) becomes larger, and so the event \(W_H < x - J(\theta_H)\) is more likely to happen.
If we have two designs \(\theta_{H,1}\) and \(\theta_{H,2}\) with \(J(\theta_{H,1}) \leq J(\theta_{H,2})\), then from Eq. 53 and the above analysis, we have,
where \(W_{H,1}\) and \(W_{H,2}\) are the noises when evaluating \(\theta_{H,1}\) and \(\theta_{H,2}\); they are I.I.D. Equation 54 tells us that, when \(J(\theta_{H,1}) \leq J(\theta_{H,2})\),
From Eqs. 50, 51 and 55, we have, when \(J(\theta_{H,1}) \leq J(\theta_{H,2})\),
Equation 56 tells us that the smaller the true performance of the heuristic design \(\theta_H\) is, the higher the probability that it is observed to be better than the t-th best design in N. Let us denote by \(\theta_{n\mbox{\scriptsize\%}}\) the design whose true rank in Θ is exactly at top \(n\mbox{\%}\) of Θ. Thus, we have,
This is because \(\theta_{n\mbox{\scriptsize\%}}\) is the best design among the designs with ranks not smaller than \(n{\mbox{\%}}\).
Thus, in order to limit the probability of making a Type II error, we need only consider the case in which this probability is the largest. When t = 1, from Eq. 57,
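In words, the conclusion of this subsection can be restated compactly as follows (a paraphrase using the notation of the symbol list, not a reproduction of the paper's displays):

\[ P\{\mbox{Type II error}\} \;\leq\; \max_{\theta_H:\, R_\Theta(\theta_H)\,\geq\, R_\Theta(\theta_{n\mbox{\scriptsize\%}})} P\left\{\hat{J}(\theta_H)<\hat{J}(\theta_{N,[1]})\right\} \;=\; P\left\{\hat{J}(\theta_{n\mbox{\scriptsize\%}})<\hat{J}(\theta_{N,[1]})\right\}. \]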
A.2 An upper bound for \(P\{\hat{J}(\theta_{n\mbox{\scriptsize\%}})<\hat{J}(\theta_{N,[1]})\}\)
We will prove
where \(f_W(x)\) and \(F_W(x)\) are the p.d.f. and the c.d.f. of the noise W, respectively. We have \(\hat{J}(\theta_{n\mbox{\scriptsize\%}})=J(\theta_{n\mbox{\scriptsize\%}})+W\), where W is the noise when evaluating this design. We have
Then,
If \(\theta_{n\mbox{\scriptsize\%}}\) is observed to be better than \(\hat{J}(\theta_{N,[1]})\), it means that this design is observed to be better than every design in N, i.e., it is observed to be better than \(\hat{J}(\theta_{N,i})=J(\theta_{N,i})+W_{N,i},\) i = 1,2,...,|N|. Since every \(\theta_{N,i}\) is uniformly sampled from the search space and \(W_{N,i}\) is I.I.D. noise, the \(\hat{J}(\theta_{N,i})\) are I.I.D. Thus Eq. 61 is transformed into
The last “=” in Eq. 62 holds because the \(\hat{J}(\theta_{N,i})\) are I.I.D. Now let us focus on Eq. 62; for any i, we have
By the total probability formula, we have
In Eq. 64, we have,
The last “=” in Eq. 65 holds because the noise is independent of the true performance of a uniformly sampled design. For Eq. 65, we have
It is “\(\leq 1-n\mbox{\%}\)”, not “\(=1-n{\mbox{\%}}\)”, in Eq. 66 because \(\mbox{OPC}_{\Theta}\) may not be strictly increasing. For a strictly increasing \(\mbox{OPC}_{\Theta}\), “\(\leq 1-n\mbox{\%}\)” in Eq. 66 should be replaced by “\(=1-n{\mbox{\%}}\)”. This is because for a strictly increasing \(\mbox{OPC}_{\Theta}\) every design has a different performance, and the heuristic design with rank at \(n{\mbox{\%}}\) is better than a uniformly sampled design when and only when the uniformly sampled design is drawn from the designs with worse ranks than the heuristic design. For a not strictly increasing \(\mbox{OPC}_{\Theta}\), the ranks still run from 1 to ∣ Θ ∣; ties are broken arbitrarily and the tied designs are given different but consecutive ranks. Thus, there can be designs having the same performance as \(\theta_{n\mbox{\scriptsize\%}}\) but with worse ranks than \(\theta_{n\mbox{\scriptsize\%}}\). This is the reason why it is “\(\leq 1-n\mbox{\%}\)” in Eq. 66.
From Eqs. 65 and 66, we know Eq. 64 can be changed to
Since no conditional probability can be larger than 1, we have
Here \(F_W(x)\) stands for the c.d.f. of the I.I.D. noise. By Eqs. 67 and 68, we have
From Eqs. 62, 63 and 69, we have
As we can see, we have proved
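As an illustrative sanity check of this inequality (not part of the paper), the sketch below estimates the left-hand side by Monte Carlo and evaluates the right-hand side by numerical integration; the Gaussian noise, the evenly spread true performances, and all numerical constants are assumptions made only for this check, whereas the bound itself is distribution free.

```python
# Monte Carlo sanity check of
#   P{ \hat{J}(theta_{n%}) < \hat{J}(theta_{N,[1]}) }  <=  \int (1 - n% F_W(x))^{|N|} f_W(x) dx.
# Gaussian noise and evenly spread true performances are assumed for illustration only.
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

rng = np.random.default_rng(0)

size_theta = 10_000     # |Theta|
g = 0.05                # n% written as a fraction (top 5%)
N = 100                 # |N|
sigma = 0.2             # noise standard deviation (assumption)
trials = 50_000

J = np.arange(1, size_theta + 1) / size_theta        # true performances, smaller is better
J_g = J[int(np.ceil(g * size_theta)) - 1]            # J(theta_{n%}): design ranked exactly at top n%

# Left-hand side: Monte Carlo estimate.
sampled = rng.integers(0, size_theta, size=(trials, N))            # uniform i.i.d. sampling from Theta
best_observed = (J[sampled] + rng.normal(0.0, sigma, size=(trials, N))).min(axis=1)
lhs = np.mean(J_g + rng.normal(0.0, sigma, size=trials) < best_observed)

# Right-hand side: numerical integration of the bound.
rhs, _ = quad(lambda x: (1.0 - g * norm.cdf(x, scale=sigma)) ** N * norm.pdf(x, scale=sigma),
              -np.inf, np.inf)

print(f"Monte Carlo estimate of LHS: {lhs:.4f}   integral bound (RHS): {rhs:.4f}")
```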
A.3 An upper bound of \(\int_{-\infty}^{\infty}(1-n\mbox{\%}F_W(x))^{|N|}f_W(x)dx\)
In this section, we shall prove that the right-hand side (RHS) of Eq. 71 is no larger than \(\min\limits_{p\in [0,1]}\{p+(1-n\mbox{\%}p)^{|N|}(1-p)\}\), that is,
We rewrite Eq. 70 as follows, i.e., the integral is divided into two parts, separated at \(x_0\), where \(x_0\) can be any real number,
For the first term in the RHS of Eq. 72, we have
For the second term in the RHS of Eq. 72, we have
From Eqs. 73 and 74, Eq. 72 can be changed to
Please recall that we are concerned with
Thus, from Eqs. 71 and 75, Eq. 76 can finally be bounded by
Please notice that Eq. 77 holds for any \(x_0\). Thus, for any p ∈ [0, 1], we can derive from Eq. 77 that
Thus, we have
Thus, our problem now reduces to optimizing Eq. 79. If we can find the minimum of the expression in Eq. 79, we obtain an upper bound on the probability of making a Type II error.
A.4 The convexity of the bound in the RHS of Eq. 79
In this section, we shall show \(p+(1-n\mbox{\%}p)^{|N|}(1-p)\) is a convex function of p in [0,1]. We denote,
We notice,
Thus, there must be a minimum and it appears within (0,1). Next, we prove the convexity of h(p) over (0, 1).
For |N| ≥ 2, we have
For |N| = 1, we have
Please see Eq. 83. The second term in the RHS is obviously larger than 0. And, inside the braces “{}”,
Thus, the first term in the RHS of Eq. 83 is also larger than 0. Then, for any positive integer |N|, we have
This finishes the proof of the convexity of \(p+(1-n\mbox{\%}p)^{|N|}(1-p)\) for p ∈ [0,1].
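As an optional numerical spot check of this convexity (an illustration only; the analytic proof above is what matters), one can verify that the discrete second differences of h(p) on a grid are nonnegative for a few illustrative pairs of \(n\mbox{\%}\) and |N|:

```python
# Discrete convexity spot check: on an equally spaced grid, a convex function
# satisfies h(p_{i-1}) + h(p_{i+1}) - 2 h(p_i) >= 0.  The tested (n%, |N|)
# pairs are arbitrary illustrative choices.
import numpy as np

def h(p, g, N):
    """h(p) = p + (1 - g p)^N (1 - p), with g the fraction n% and N = |N|."""
    return p + (1.0 - g * p) ** N * (1.0 - p)

p = np.linspace(0.0, 1.0, 2001)
for g in (0.01, 0.05, 0.2):
    for N in (1, 2, 10, 100):
        y = h(p, g, N)
        assert np.all(y[:-2] + y[2:] - 2.0 * y[1:-1] >= -1e-12), (g, N)
print("discrete convexity check passed for all tested (n%, |N|) pairs")
```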
A.5 Solution to \(\min\limits_{p\in [0,1]}\{p+(1-n\mbox{\%}p)^{|N|}(1-p)\}\)
In Section A.3, we have proved Eq. 79, which is
Thus, if
holds, the following is guaranteed,
We know from Section A.4 that there is one and only one minimum of h(p), p ∈ [0,1], and this minimum is attained within (0,1).
We should solve Eq. 82, i.e., \(\frac{dh(p)}{dp}=0\), to obtain the p corresponding to the minimum. However, it is difficult to obtain a closed form for p from Eq. 82.
There are at least two methods to address this difficulty. The first method is numerical calculation: since Eq. 87 guarantees convexity, a gradient method (or any one-dimensional convex solver) works very well; a minimal sketch is given below. The second method, presented after the sketch, is as follows.
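The following is a minimal numerical sketch of the first method (a sketch only, using SciPy's bounded scalar minimizer rather than an explicit gradient iteration; the particular values of \(n\mbox{\%}\) and |N| are arbitrary illustrative choices):

```python
# Numerical minimization of h(p) = p + (1 - n% p)^{|N|} (1 - p) over [0, 1].
# Since h is convex (Section A.4), a bounded one-dimensional solver finds the
# unique minimum reliably.  The values of n% and |N| are illustrative.
from scipy.optimize import minimize_scalar

def h(p, g, N):
    return p + (1.0 - g * p) ** N * (1.0 - p)

g, N = 0.10, 1000    # e.g. claiming top 10% against 1000 uniform samples
res = minimize_scalar(lambda p: h(p, g, N), bounds=(0.0, 1.0), method="bounded")
print(f"argmin p = {res.x:.4f}, upper bound on P(Type II error) = {res.fun:.4f}")
```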
What we want is \(\min\limits_{p\in [0,1]}\{p+(1-n\mbox{\%}p)^{|N|}(1-p)\}\leq \beta_0\). To make this hold, we only need to find a p ∈ [0,1] such that the following holds,
For a coefficient c, 0 < c < 1,
Since p = c×\(\beta_0\) ∈ (0,1), we can substitute Eq. 89 into Eq. 88 to obtain
From Eq. 90, we obtain
Equation 91 means that, if a heuristic design is observed to be better than every design in N, the heuristic design can be judged to be within the top \(n\mbox{\%}=\min\limits_{0<c<1}\frac{1}{c\times\beta_0}\left\{ 1-\exp\left\{ \frac{\ln\left( \frac{(1-c)\beta_0}{1-c\beta_0} \right)}{|N|} \right\} \right\}\) of the search space Θ, regardless of the type and magnitude of the noise. In doing so, the probability of the Type II error is not larger than the given level \(\beta_0\). This finishes the proof of Eq. 31. Solving Eq. 91 numerically, we obtain Table 3.
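The following short sketch (an assumed helper, not from the paper) evaluates the expression in Eq. 31 by a grid search over c; for a given \(\beta_0\) and |N| its output can be compared against the corresponding entry of Table 3.

```python
# Evaluate Eq. 31 by a grid search over c:
#   n% = min_{0<c<1} (1/(c beta_0)) * (1 - exp( ln((1-c) beta_0 / (1 - c beta_0)) / |N| )).
# The beta_0 and |N| values below are illustrative.
import math

def n_percent(beta0, N, grid=10_000):
    """Smallest top-n% fraction certified by Eq. 91 for the given beta0 and |N|."""
    best = float("inf")
    for k in range(1, grid):
        c = k / grid
        val = (1.0 - math.exp(math.log((1.0 - c) * beta0 / (1.0 - c * beta0)) / N)) / (c * beta0)
        best = min(best, val)
    return min(best, 1.0)   # a claim of "top 100%" is vacuous but always safe

for N in (1000, 2000, 5000, 10_000):
    print(f"|N| = {N:>6}:  n% = {n_percent(0.05, N):.4f}")
```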
A.6 The quick formula and the closed form for \(n{\mbox{\%}}\)
In this section, we shall prove that,
This is Eq. 34 in Theorem 1. And we will also prove,
If the above is true, the heuristic design can be judged to be within top \(n{\mbox{\%}}\) with the following expression,
This is Eq. 36 in Theorem 1. Before we give the proof, we give a lemma.
Lemma 2
\(1-\exp(x)\leq -x\) for any real x. And by simple transformations, we have, ln (1 + x) ≤ x, x > − 1.
Proof
Let \(f(x)=\exp(x)-1-x\). We have
Thus, f(x) is convex. The only minimum appears where \(\frac{df(x)}{dx}=\exp(x)-1=0\); solving this, we obtain x = 0. Thus, the minimum of f(x) is \(f(0) =\exp(0)-1-0= 0\). We then have
Then \(\exp(x)\geq 1+x\). Taking the logarithm of both sides (for 1 + x > 0), we have ln (1 + x) ≤ x, x > − 1.□
Applying Lemma 2 to Eq. 91 with \(x=\frac{\ln\left( \frac{(1-c)\beta_0}{1-c\beta_0} \right)}{|N|}\), we can easily establish
So, we can judge \(n\mbox{\%}\) to be
Given \(\beta_0\), we can numerically find the c that minimizes \(\frac{-\ln\left( \frac{(1-c)\beta_0}{1-c\beta_0} \right)}{c\times \beta_0}\) over 0 < c < 1. We find that when \(\beta_0 = 0.05\), the minimizing c is very close to c = 0.826. When c = 0.826, from Eq. 93, we have
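The quick-formula value for a given \(\beta_0\) and |N| can be checked numerically with the following sketch (the function name and grid search are illustrative assumptions); for \(\beta_0 = 0.05\) the minimizing c should come out close to the value 0.826 quoted above.

```python
# Numerical check of the quick formula: find the c minimizing
#   -ln((1-c) beta_0 / (1 - c beta_0)) / (c beta_0),  0 < c < 1,
# so that n% can be taken as (that minimum) / |N|.
import math

def quick_constant(beta0, grid=100_000):
    best_c, best_val = None, float("inf")
    for k in range(1, grid):
        c = k / grid
        val = -math.log((1.0 - c) * beta0 / (1.0 - c * beta0)) / (c * beta0)
        if val < best_val:
            best_c, best_val = c, val
    return best_c, best_val

c_star, const = quick_constant(0.05)
print(f"minimizing c = {c_star:.3f}, constant = {const:.2f}")   # n% is roughly constant / |N|
print(f"e.g. |N| = 1000 gives n% = {const / 1000:.4f}")
```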
To go further, we examine the numerator of the RHS of Eq. 92,
We have,
From Eq. 96, Eq. 95 can be changed to
By Lemma 2 (ln (1 + x) ≤ x, x > − 1), we have, for the second term of the RHS of Eq. 97,
From Eqs. 92, 95, 97 and 98, we have
We denote
We want to find the minimum of this function. We have
Thus, f(c) is convex for 0 < c < 1. There is one and only one minimum for 0 < c < 1. The minimum appears when
Solving it, we obtain
Substituting this c into f(c), we obtain
based on which Eq. 36 can be easily derived.
Cite this article
Shen, Z., Zhao, QC. & Jia, QS. Quantifying Heuristics in the Ordinal Optimization Framework. Discrete Event Dyn Syst 20, 441–471 (2010). https://doi.org/10.1007/s10626-009-0087-2