Microbenchmarks For Determining Branch Predictor Organization

Milena Milenkovic, Aleksandar Milenkovic, Jeffrey Kulick
Electrical and Computer Engineering Department
The University of Alabama in Huntsville
301 Sparkman Drive, Huntsville, AL 35899
E-mail: {milenkm, milenka, kulick}@ece.uah.edu

Summary

In order to achieve optimum performance of a given application on a given computer platform, a program developer or compiler must be aware of computer architecture parameters, including those related to branch predictors. Although dynamic branch predictors are designed to adapt automatically to changes in branch behavior during program execution, code optimizations based on information about the predictor structure can greatly increase overall program performance.
Yet, exact predictor implementations are seldom made public, even though processor manuals provide valuable optimization hints.

This paper presents an experiment flow with a series of microbenchmarks that determine the organization and size of a branch predictor using on-chip performance monitoring registers. Such knowledge can be used either for manual code optimization or for design of new, more architecture-aware compilers. Three examples illustrate how insight into exact branch predictor organization can be directly applied to code optimization. The proposed experiment flow is illustrated with microbenchmarks tuned for Intel Pentium III and Pentium 4 processors, although they can easily be adapted for other architectures. The described approach can also be used during processor design for performance evaluation of various branch predictor organizations and for testing and validation during implementation.

Keywords: compiler optimizations, microbenchmarks, branch predictor, performance monitoring

Introduction

The better performance of today's microprocessors is due not only to the increase in operating frequency, but also to the increase in processor complexity in every new generation. Compilers must keep up with new processor features, such as extended instruction sets, pipelining, multiple-level cache hierarchies, instruction level parallelism, and branch prediction, exploiting new optimization possibilities. Although compilers for new processors do include some advanced optimization features, for instance the Intel C++ Compiler [1], future compilers must be even more aware of the underlying architecture. Currently, program developers must explicitly set compiler switches that tell the compiler which architecture to optimize the code for.
Intel processors also include CPUID, the CPU Identification instruction, which provides information about some of the processor features, such as cache and TLB [1]. Another way to extract the required information is to perform a series of microbenchmarks that experimentally explore architectural properties. For instance, a program can automatically determine memory hierarchy parameters [2], [3]. This kind of program can be incorporated into future compilers: the compiler would first assess relevant architectural parameters of a processor and then optimize the code according to the obtained parameter values. The information about the underlying architecture can also be applied to manual code optimizations, such as a blocking transformation that improves code spatial locality [4].

The successful resolution of conditional branches is a crucial performance issue in modern superscalar processors. When a conditional branch enters the execution pipeline, all instructions following the branch must wait for the branch resolution.
A common solution to this problem is speculative execution: the branch outcome and/or its target are dynamically or statically predicted, so the execution can go on without stalls. If a branch is mispredicted, speculatively executed instructions must be flushed and their results discarded, thus wasting a significant number of processor clock cycles. For example, the Pentium 4 has a misprediction penalty of 20 clock cycles [5], and future processors may have even higher penalties, up to 50 clock cycles [6], since deep pipelines are necessary for achieving very high clock frequencies.

With static branch prediction, a branch outcome is predicted statically at compile time using the branch type, branch direction, and/or profiling information. Although static prediction may work well for some applications, dynamic prediction solves more general cases, since it is able to automatically adapt to changes in branch behavior during program execution.
A predictor's size and organization may limit its ability to give a correct prediction. If the compiler/developer is aware of the branch predictor intricacies, the code can be optimized to overcome some limitations, and consequently the overall program performance increases.

Modern processors, such as Intel Pentium III (P6 architecture) and Pentium 4 (NetBurst architecture), include some form of dynamic branch prediction mechanism, but available information about exact predictor organization is rather scarce. On the other hand, almost all modern processors include performance-monitoring registers that can count several branch-related events, and quite powerful tools for easy access to these registers are available [7], [8].

This paper presents an experiment flow that uncovers branch predictor organization using performance-monitoring registers and illustrates how such knowledge can improve code optimization.
A set of "spy" microbenchmarks tests the existence and/or value of particular branch predictor parameters: the use of global and/or local branch history, the number of history bits, and the predictor size and organization (http://www.ece.uah.edu/~lacasa/). Another application of the proposed experiment flow is for testing and validation of the branch predictor design during processor implementation. The microbenchmarks can also be used in research looking for better branch predictors.

The proposed experiment flow is illustrated on Pentium III and Pentium 4 processors, though only minor modifications are necessary to adapt the proposed microbenchmarks to other processor architectures. The results indicate that Pentium III has a local branch predictor with 4 branch history bits, and the Pentium 4 uses a global branch predictor with 16 history bits.
The experiments also determine the organization of the branch target buffer and the address bits used to access it.

The next section provides an overview of dynamic branch prediction, followed by examples of predictor-aware code optimizations. A description of the experimental environment sets the stage for a detailed explanation of the proposed experiment flow. Finally, the results of the experiments for observed architectures are presented.

Dynamic Branch Prediction

No matter how complex a branch predictor is, it can be described by some variation of the general scheme (Figure 1), consisting of two major parts: a branch target buffer (BTB) for prediction of branch targets, and an outcome predictor for prediction of branch outcomes.

The branch target buffer is a cache structure, where a part of the branch address is used as the cache index, and the cache data is the last target address of that branch.
More complex BTBs can hold more than one possible target address, together with some mechanism that chooses which target instructions should be speculatively executed. Some implementations can also store target instructions, and even whole target basic blocks [4]. The prediction of branch outcomes can be coupled with or decoupled from the BTB: if the outcome predictor and the BTB are coupled, only branches that hit in the BTB are predicted, while a static prediction algorithm is used on a BTB miss. If the BTB is decoupled from the outcome predictor, all branch outcomes are predicted using the outcome predictor.
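
The basic BTB described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table is direct-mapped, the low `index_bits` of the branch address form the index, the remaining bits are kept as a tag, and the class and parameter names are hypothetical. It does not describe the BTB of any particular processor.

```python
class BranchTargetBuffer:
    """Toy direct-mapped BTB (assumed organization, for illustration only)."""

    def __init__(self, index_bits=4):
        self.index_bits = index_bits
        # Each entry is None or a (tag, last_target) pair.
        self.entries = [None] * (1 << index_bits)

    def _split(self, branch_addr):
        # Low address bits select the entry; the rest is stored as a tag.
        index = branch_addr & ((1 << self.index_bits) - 1)
        tag = branch_addr >> self.index_bits
        return index, tag

    def lookup(self, branch_addr):
        """Return the predicted target, or None on a BTB miss."""
        index, tag = self._split(branch_addr)
        entry = self.entries[index]
        if entry is not None and entry[0] == tag:
            return entry[1]
        return None

    def update(self, branch_addr, target):
        """Record the last target of a resolved branch."""
        index, tag = self._split(branch_addr)
        self.entries[index] = (tag, target)


btb = BranchTargetBuffer(index_bits=4)
btb.update(0x1004, 0x2000)
hit = btb.lookup(0x1004)      # branch seen before: its last target
miss = btb.lookup(0x1008)     # branch never seen: miss
btb.update(0x5004, 0x3000)    # same low 4 address bits as 0x1004
evicted = btb.lookup(0x1004)  # the earlier branch has been replaced
```

Two branches whose addresses share the index bits compete for the same entry, which is why knowing which address bits index the BTB matters for code layout.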

[Figure 1 diagram: a BTB table shown next to an Outcome Predictor block.]
Figure 1 General Branch Predictor Scheme.

Dynamic prediction of a branch outcome is based on the state of a finite-state machine, which is usually a two-bit saturating counter [9], depicted in Figure 2.
In states Strongly Taken and Weakly Taken a branch is predicted as taken, and it is predicted as not taken in the other two states, Weakly Not Taken and Strongly Not Taken.

[Figure 2 diagram: four states, Strongly NT (00), Weakly NT (01), Weakly T (11), and Strongly T (10). A taken branch (T) moves the state one step toward Strongly T, a not-taken branch (NT) moves it one step toward Strongly NT, and the two strong states loop back to themselves on T and NT, respectively.]
Figure 2 Two-bit saturating counter: T=taken branch; NT=not taken.
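
The behavior of the two-bit saturating counter can be sketched as follows. The integer encoding 0 to 3 is an assumption chosen so that the transitions become simple increments and decrements; it does not match the bit labels printed in Figure 2. The class and function names are hypothetical.

```python
class TwoBitCounter:
    """Two-bit saturating counter.

    Assumed encoding (not the bit labels of Figure 2):
    0 = Strongly NT, 1 = Weakly NT, 2 = Weakly T, 3 = Strongly T.
    """

    def __init__(self, state=0):
        self.state = state

    def predict(self):
        # Predict taken in the two "taken" states.
        return self.state >= 2

    def update(self, taken):
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(self.state + 1, 3)
        else:
            self.state = max(self.state - 1, 0)


def count_mispredictions(outcomes, counter):
    """Predict each outcome, then train the counter with the real result."""
    misses = 0
    for taken in outcomes:
        if counter.predict() != taken:
            misses += 1
        counter.update(taken)
    return misses


# A loop-closing branch: taken 9 times, then not taken, repeated 3 times.
loop_branch = ([True] * 9 + [False]) * 3
misses = count_mispredictions(loop_branch, TwoBitCounter(state=0))
```

Starting from Strongly NT, the counter mispredicts the first two taken outcomes while it warms up and then mispredicts only the loop exit in each round. The single not-taken outcome moves it to Weakly T, so the next round starts with a correct prediction.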

This counter is a cell of a branch prediction table (BPT), which can be accessed in different ways. The simplest BPT index is a portion of the branch address. More complex two-level predictors combine the branch address, or a part of it, with a branch history register (BHR). The BHR is a shift register that keeps the history of the N most recent branch outcomes, where N is the number of bits of the shift register [10], [11]. The BPT index function is usually a concatenation or exclusive OR of the branch address and the corresponding BHR. Based on the type of recorded branch history, predictors can be global or local.
Global two-level predictors benefit from correlations between subsequent branches in the program execution flow (Figure 3a), while local predictors are based on correlation between subsequent executions of the same branch (Figure 3b).

In order to further reduce the number of branch mispredictions in wide-issue superscalar processors, more advanced mechanisms have been proposed, such as hybrid branch predictors. Hybrid branch predictors can include both global and local prediction mechanisms, as well as some other prediction schemes, e.g., specialized loop predictors [12].
Instead of exploiting the correlation between outcomes of the last N branches (pattern-based), the dynamic branch predictor can use the information of the path to the current branch (path-based) [13]. The path history register stores address bits from each of the P most recently executed branches, thus making the prediction path-dependent. One predictor can combine both pattern-based and path-based approaches. Specialized predictors can handle some special branch types, such as returns and loops.
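
To make the role of the branch history register concrete, the sketch below models a two-level pattern-based predictor with a global BHR. Several details are assumptions made for illustration: the index function is an exclusive OR of the low branch address bits with the BHR, the counters use the integer encoding 0 to 3 and start in the weakly not-taken state, and the table sizes and all names are hypothetical. It is not the organization of the Pentium III or Pentium 4 predictors.

```python
class TwoLevelPredictor:
    """Global two-level predictor sketch (assumed XOR index function)."""

    def __init__(self, history_bits, index_bits):
        self.history_bits = history_bits
        self.index_mask = (1 << index_bits) - 1
        self.bhr = 0                         # global branch history register
        self.bpt = [1] * (1 << index_bits)   # two-bit counters, weakly NT

    def _index(self, branch_addr):
        # Assumed index function: branch address XOR branch history.
        return (branch_addr ^ self.bhr) & self.index_mask

    def predict(self, branch_addr):
        return self.bpt[self._index(branch_addr)] >= 2

    def update(self, branch_addr, taken):
        i = self._index(branch_addr)
        self.bpt[i] = min(self.bpt[i] + 1, 3) if taken else max(self.bpt[i] - 1, 0)
        if self.history_bits:
            # Shift the newest outcome into the N-bit history.
            mask = (1 << self.history_bits) - 1
            self.bhr = ((self.bhr << 1) | int(taken)) & mask


def mispredictions(predictor, branch_addr, outcomes):
    misses = 0
    for taken in outcomes:
        misses += predictor.predict(branch_addr) != taken
        predictor.update(branch_addr, taken)
    return misses


# One branch that strictly alternates taken / not taken.
pattern = [True, False] * 50
no_history = mispredictions(TwoLevelPredictor(0, 6), 0x24, pattern)
with_history = mispredictions(TwoLevelPredictor(4, 6), 0x24, pattern)
```

With no history bits the single counter oscillates between its two weak states and mispredicts every outcome of the alternating branch. With four history bits the two recurring history patterns select different counters, so only a few warm-up mispredictions remain.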

[Figure 3 (a): block diagram. The Branch Address and the Global BHR feed the Index Function, which selects an entry in the Branch Prediction Table; the selected entry provides the Outcome Prediction.]

[Figure 3 (b): block diagram. The Branch Address selects a BHR in the Local BHR Table; that BHR and the Branch Address feed the Index Function, which selects an entry in the Branch Prediction Table; the selected entry provides the Outcome Prediction.]

Figure 3 Global (a) and local (b) two-level branch predictor.
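
The two organizations in Figure 3 can be sketched in Python as follows. This is an illustrative model only: the index function (address bits concatenated with history bits), the two-bit saturating counters, the initial counter state, and all names are assumptions, not the organization of any particular processor.

```python
def update_counter(counter, taken):
    """Two-bit saturating counter: 0-1 predict not taken, 2-3 predict taken."""
    return min(counter + 1, 3) if taken else max(counter - 1, 0)


class TwoLevelPredictor:
    """Global (one shared BHR) or local (one BHR per address slot) predictor."""

    def __init__(self, history_bits, address_bits, local):
        self.history_bits = history_bits
        self.history_mask = (1 << history_bits) - 1
        self.address_mask = (1 << address_bits) - 1
        self.local = local
        # Local: a table of BHRs selected by the address; global: a single BHR.
        self.bhr = [0] * ((1 << address_bits) if local else 1)
        # Branch prediction table, counters start as weakly not taken.
        self.bpt = [1] * (1 << (history_bits + address_bits))

    def _slots(self, address):
        addr = address & self.address_mask
        bhr_slot = addr if self.local else 0
        # Assumed index function: address bits concatenated with history bits.
        bpt_index = (addr << self.history_bits) | self.bhr[bhr_slot]
        return bhr_slot, bpt_index

    def predict(self, address):
        _, index = self._slots(address)
        return self.bpt[index] >= 2

    def update(self, address, taken):
        slot, index = self._slots(address)
        self.bpt[index] = update_counter(self.bpt[index], taken)
        self.bhr[slot] = ((self.bhr[slot] << 1) | int(taken)) & self.history_mask


def count_mispredictions(predictor, trace):
    """Run a list of (address, taken) pairs and count wrong predictions."""
    misses = 0
    for address, taken in trace:
        if predictor.predict(address) != taken:
            misses += 1
        predictor.update(address, taken)
    return misses


# A single branch that alternates taken / not taken is learned after warm-up.
alternating = [(0x40, i % 2 == 0) for i in range(20)]
local_misses = count_mispredictions(TwoLevelPredictor(2, 4, local=True), alternating)
global_misses = count_mispredictions(TwoLevelPredictor(2, 4, local=False), alternating)
```

For a trace with a single branch the two variants behave identically; they differ once several branches interleave, because the global variant mixes all outcomes into one history register.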

In order to reduce the number of mispredictions, branch predictors are getting larger and more complex. However, code optimizations are still vital for processor performance, since the large number of pipeline stages and superscalar fetch/decode make modern processors more sensitive to branch mispredictions.

Examples of Branch Optimization by Architecture-Aware Compiler

The following three examples illustrate how knowledge about the underlying branch predictor structure can improve code optimization. The first example deals with processor architectures with global branch predictors, and it is inspired by code generation guidelines explained in Sun’s UltraSparc User’s Manual [14].
The second example shows a possible optimization for local branch predictors, and it is based on hints given in one of the Intel Pentium III optimization guidelines [15]. Finally, the last example shows how knowledge about the size and organization of the branch predictor structure can reduce branch interference. Actual implementation of an architecture-aware compiler, which is outside the scope of this paper, must also take into account other performance factors, such as possible cache miss increases due to changes in executed code length.

Let us first consider a processor with a global branch predictor that uses N global history bits. This predictor is able to correctly predict the outcome of a branch correlated with up to N previously executed branches, while correlations longer than N cannot be detected.
If the outcome of a particular branch depends on more than N previous branches, the compiler can split the code and duplicate branches as necessary, replacing a long branch correlation with several shorter ones.

Figure 4a shows the control flow for one such scenario, where the branch D outcome is the AND function of the outcomes of two previously executed branches: A and B, or A and C (Table 1). In this example the branch predictor uses only one bit of global history (N=1). Since the predictor is able to “remember” only one previous branch, it cannot distinguish between branch histories 01, when D is not taken, and 11, when D is taken. The BPT index function takes as arguments the branch D address and one history bit, so the same BPT cell is accessed in both cases.
Assuming a two-bit saturating counter, one or both corresponding branch D outcomes are mispredicted, depending on the counter start state (Table 1a). In the other two cases -- i.e., branch histories 00 and 10 -- one history bit is enough for a correct prediction, since the D outcome is equal to the outcome of the previous branch. Figure 4b shows the code modified by an architecture-aware compiler. Block 7 code and branch D are duplicated, to blocks 7-1 and 7-2, and branches D1 and D2: branch D1 is on the not taken path of branch A, and branch D2 is on the taken path. Now the D1 outcome is always 0, and the D2 outcome is equal to the branch C outcome (Table 1b).
Since branches D1 and D2 are located at different addresses, they have separate predictor entries for the same branch histories, so both are correctly predicted by separate two-bit counters. In all cases we assume that this control flow is a part of a loop, and the predictor can be dynamically "trained."

[Figure 4: two control-flow graphs. (a) Branch A leads to Block 1 (outcome 0) or Block 2 (outcome 1). Block 1 is followed by Branch B, which leads to Block 3 (0) or Block 4 (1); Block 2 is followed by Branch C, which leads to Block 5 (0) or Block 6 (1). Blocks 3 to 6 all merge into Block 7, followed by Branch D, which leads to Block 8 (0) or Block 9 (1). (b) The same graph, except that Blocks 3 and 4 merge into Block 7-1 followed by Branch D1, and Blocks 5 and 6 merge into Block 7-2 followed by Branch D2; both D1 and D2 lead to Block 8 (0) or Block 9 (1).]

Figure 4 Original (a) and optimized (b) code structure: Branch D depends on two previously executed branches, and the branch predictor is global with one history bit.

Table 1 Branch outcome scenarios for original (a) and optimized code structure (b). In the branch predictor with one global history bit, a two-bit saturation counter with starting state WT (Weakly Taken) mispredicts both shaded outcomes and with other starting states, mispredicts one of them.
(a)

Branch A    Branch B/C    Branch D
0           0             0
0           1             0 *
1           0             0
1           1             1 *

(b)

Branch A    Branch B      Branch D1
0           0             0
0           1             0

Branch A    Branch C      Branch D2
1           0             0
1           1             1

* Shaded outcome in the original table.

A similar optimization can be done for processors with a local branch predictor. Let us consider a processor with a local branch predictor using N bits of local branch history.
The outcome of a loop condition branch can be correctly predicted if the loop does not have more than N iterations. Loops with more than N iterations can be unrolled, so the existing predictor can predict each unrolled loop. The compiler should perform loop unrolling if such a loop belongs to a critical portion of the code that executes frequently, and if it needs to be unrolled only a few times. Figure 5a shows an example of code where the inner loop executes eight times and then exits, while the outer loop executes 1 million times. If this code executes on a processor with a local branch predictor using four bits of local history and two-bit saturating counters, the inner loop condition branch is mispredicted once in every nine executions, for a total of 1 million branch mispredictions. After eight loop iterations, the two-bit counter for the history 1111 will be in the Strong Taken state.
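The misprediction count quoted above can be checked with a short simulation. The C sketch below is only a model under stated assumptions: the loop condition branch is taken `trip_count` times and then not taken once at the exit, the predictor keeps `history_bits` of local history for this branch, and each history pattern selects a private two-bit counter that starts in the weakly taken state. The function name `loop_mispredictions` and the constant `MAX_HISTORY_BITS` are hypothetical.

```c
#include <string.h>

#define MAX_HISTORY_BITS 16

/*
 * Count mispredictions of one loop-condition branch.
 * The branch is taken trip_count times and then not taken once (loop exit);
 * this pattern repeats outer_iterations times.
 * Assumed model: history_bits of local history select one of
 * 2^history_bits two-bit counters, all starting in weakly taken (2).
 * Returns -1 if history_bits is out of range.
 */
long loop_mispredictions(int trip_count, int history_bits, long outer_iterations)
{
    static unsigned char counters[1u << MAX_HISTORY_BITS];
    unsigned history = 0;
    unsigned mask;
    long misses = 0;

    if (history_bits < 1 || history_bits > MAX_HISTORY_BITS)
        return -1;
    mask = (1u << history_bits) - 1u;
    memset(counters, 2, sizeof counters);

    for (long i = 0; i < outer_iterations; ++i) {
        for (int k = 0; k <= trip_count; ++k) {
            int taken = (k < trip_count);        /* last execution is the exit */
            unsigned char *ctr = &counters[history];
            int predicted_taken = (*ctr >= 2);

            if (predicted_taken != taken)
                ++misses;
            if (taken && *ctr < 3)
                ++*ctr;                          /* saturate at strongly taken */
            else if (!taken && *ctr > 0)
                --*ctr;                          /* saturate at strongly not taken */
            history = ((history << 1) | (unsigned)taken) & mask;
        }
    }
    return misses;
}
```

Under these assumptions, `loop_mispredictions(8, 4, 1000000)` returns 1000000, one misprediction per execution of the inner loop, while a loop with only four iterations, `loop_mispredictions(4, 4, 1000000)`, returns 1, a single warm-up miss.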
At the loop exit, the four bits of local history are the same as in the previous four iterations (1111); hence the exit case uses the same BPT cell as the others, and cannot be predicted correctly. An architecture-aware compiler can unroll the inner loop, and in this example unrolling twice is enough (Figure 5b). Both new loop branches can be correctly predicted with four bits of local history: now the 4-bit branch history at the exit is unique with pattern 1111, so the exit case is mapped to a separate BPT entry. The number of execution cycles lost to mispredicted branches is significantly reduced.

(a)

    for (i=0;i<1000000;++i){
    ...
      //original inner loop
      for (j=0;j<8;++j){
        ...
      }
    ...
    }

(b)

    for (i=0;i<1000000;++i){
    ...
      //inner loop unrolled twice
      for (j=0;j<4;++j){
        ...
      }
      for (j=0;j<4;++j){
        ...
      }
    ...
    }

Figure 5 Original (a) and optimized (b) C code: Original inner loop depends on its eight previous executions

If the compiler is aware of predictor size and organization (i.e., the number of ways and sets and the function used for the index), it can prevent branch interference in the critical portions of the code. For example, inserting a required number of noop instructions in the code can separate branches that map to the same predictor entry. Figure 6a shows an example of two branches mapping to the same branch target buffer entry, where it is assumed that branches at a distance of 512 bytes access the same BTB cell. If the BTB is always updated, both branch targets will be mispredicted, one always replacing the other. If both branches belong to a frequently executed portion of the code, an architecture-aware compiler can insert a noop instruction before one of the branches, thus preventing interference in the BTB (Figure 6b).

(a)

                ...
    addr512:    jle l1
                ...
    l1:         ...
                ...
    addr1024:   jle l2
                ...

(b)

                ...
    addr512:    jle l1
                ...
    l1:         ...
                ...
                noop
    addr1025:   jle l2
                ...

Figure 6 Original (a) and optimized (b) assembly code: Branches mapping to the same predictor entry.

An architecture-aware compiler can encompass these mechanisms and other similar techniques. If applied to critical portions of the code, these optimizations can significantly increase performance.
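The BTB conflict of Figure 6 can be sketched in a few lines of C. This is a hypothetical model, not a description of any real BTB: it simply takes the assumption stated for Figure 6 (branches 512 bytes apart share a cell) and turns it into an index function that uses the branch address modulo 512. The names `btb_index`, `btb_conflict`, `padding_needed`, and `BTB_SPAN_BYTES` are invented for illustration, and the padding is assumed to consist of one-byte instructions placed before the second branch only.

```c
/* Assumed BTB reach: branches 512 bytes apart share a cell (as in Figure 6). */
#define BTB_SPAN_BYTES 512u

/* Hypothetical index function: the branch address modulo the assumed span. */
unsigned btb_index(unsigned long branch_address)
{
    return (unsigned)(branch_address % BTB_SPAN_BYTES);
}

/* Nonzero when two branches would replace each other's BTB entry. */
int btb_conflict(unsigned long addr_a, unsigned long addr_b)
{
    return btb_index(addr_a) == btb_index(addr_b);
}

/*
 * Number of one-byte padding instructions to place before the second
 * branch so that it no longer shares a cell with the first branch.
 * Only the second branch moves; the first is assumed to precede the padding.
 */
unsigned padding_needed(unsigned long addr_a, unsigned long addr_b)
{
    unsigned pad = 0;

    while (btb_conflict(addr_a, addr_b + pad))
        ++pad;
    return pad;
}
```

With the addresses of Figure 6, `btb_conflict(512, 1024)` is nonzero and `padding_needed(512, 1024)` is 1, which moves the second branch to address 1025. A real compiler would need the actual index function and instruction sizes, which is exactly the information the microbenchmarks are meant to recover.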
However, to be able to do so, the compiler needs to know the details about the branch predictor organization: the use of global, local, or both types of branch history; the number of history bits; and the BTB size and organization.

Experimental Environment

This paper focuses on the widely used Intel P6 (Pentium III) and NetBurst (Pentium 4) architectures, although the proposed microbenchmarks can be applied, with some modifications, to other microprocessor architectures. For both P6 and NetBurst architectures, Intel sources [1], [5], [15] do not provide the exact description of the implemented branch predictors. Rather, they provide the exact number of BTB entries and several hints about program optimization that indicate some outcome predictor parameters. If a branch is not in the BTB, a static branch prediction is used, which means that the BTB and outcome predictor are coupled. The static prediction mechanism predicts backward conditional branches as taken, and forward branches as not taken.
A return address stack of a known size predicts return addresses.

The P6 optimization reference manual states that the prediction algorithm includes pattern matching and can track up to the last 4 branch directions per branch address [15], most probably meaning that the P6 branch predictor has a local history component with 4 history bits. The P6 BTB has 512 entries.

In the NetBurst architecture implemented in the Pentium 4, Intel claims to use a new prediction algorithm, 33% better than in the P6. One of the assembly/compiler coding rules for Pentium 4 states that frequently executed loops with a predictable number of iterations should be unrolled to reduce the number of iterations to 16 or fewer, and if the loop has N conditional branches, it should be unrolled so that the number of iterations is 16/N [1].
This rule indicates that Pentium 4 uses a global outcome history, with probably 16 history bits, but the Intel sources never specifically say so.

Another interesting characteristic of the NetBurst architecture, tightly coupled with the branch prediction mechanism, is an execution trace cache [5], which stores and delivers sequences of traces, built from decoded instructions according to the execution flow. Intel sources explain that the trace cache and front-end translation engine have cooperating branch prediction hardware, so branch targets can be fetched from the trace cache, or in the case of a trace cache miss, from the second level cache or memory. The trace cache BTB is smaller (512 entries) compared to the front-end BTB (4K entries).
It seems that both the trace cache and the front-end share the same outcome predictor mechanism [15], but apart from the trace cache size (12K micro-ops) and the trace cache line size (6 micro-ops), Intel does not disclose many details about its implementation. This work considers only the front-end BTB; more experiments for the trace cache component can be found in reference [16].

Both P6 and NetBurst architectures have several performance counters, able to measure various branch-related events, such as the number of retired branches, including unconditional branches, and the number of mispredicted branches, using event-based sampling. Since the number of branches depends on a particular microbenchmark and the number of times it executes, throughout the paper the MPR (Misprediction Ratio) is often used instead of the number of mispredicted branches.
The MPR is the number of mispredicted branches divided by the total number of conditional branch instructions.

Although event-based sampling is not precise, it gives a good estimation of the number of events. A performance counter is configured to count one or more types of events and to generate an interrupt when it overflows. The counter is preset to a modulus value that will cause the overflow after a specific number of events have been counted. In this research the Intel VTune Performance Analyzer version 5.0 was used for configuration and access of performance counters. Performance counters on most non-Intel architectures, as well as Intel processors, can be accessed by using the freeware PAPI tool (Performance Application Programming Interface), developed at the University of Tennessee [8].
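
The two quantities described above can be sketched in a few lines of C. This is only an illustration, not the authors' tooling: the function names `estimate_events` and `mpr` are hypothetical, and the estimate assumes that every counter overflow (one sample) stands for exactly `modulus` events.

```c
#include <stdint.h>

/* Estimated event count from event-based sampling (assumption: each
   counter overflow, i.e. each sample, represents `modulus` events). */
uint64_t estimate_events(uint64_t samples, uint64_t modulus)
{
    return samples * modulus;
}

/* Misprediction ratio: mispredicted branches divided by conditional
   branches. Returns 0.0 when no conditional branches were counted. */
double mpr(uint64_t mispredicted, uint64_t conditional_branches)
{
    if (conditional_branches == 0)
        return 0.0;
    return (double)mispredicted / (double)conditional_branches;
}
```

For instance, 12 samples taken with a modulus of 10000 would be read as roughly 120000 events; the true count may differ by up to one modulus, which is why the monitored code has to run long enough for that error to be negligible.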

All test benchmarks are compiled using a Microsoft Visual Studio 6.0 C compiler with optimization disabled, which prevents the compiler from changing the order and number of conditional branches. For experiments with a relatively large number of branches, we have also developed programs to generate benchmarks to our specifications in assembly. In order to get reliable values of performance counters, the execution time of the monitored code must be significantly longer than the execution time of the interrupt service routine. Therefore, the test code is placed within a loop executing a relatively large number of times.

Experiment Flow

The experiment flow consists of two groups of experiments targeting the branch target buffer and the outcome predictor. Branch target buffer experiments uncover the BTB organization and address bits used as an index, and outcome predictor experiments determine the existence of local and global prediction components, and the length of the corresponding history registers.

BTB Experiments

The Intel documentation for P6 does not describe BTB organization: whether it is direct-mapped or set-associative, and the degree of associativity. One way to determine the number of ways and the address bits used as the BTB index is to run a set of microbenchmarks varying the address distance D between branch instructions (Figure 7). Each microbenchmark has N_BTB - 1 conditional branches in a loop, which, together with the loop's own conditional branch, makes a total of N_BTB conditional branches, where N_BTB is the number of BTB entries. These conditional branches are forward branches that are always taken, so they are mispredicted by the static algorithm if not present in the BTB.
Figure 8 shows the fragment of the microbenchmark code.

For a "fitting" distance D_F, when all considered branches can fit in the BTB, the number of mispredictions is close to zero, i.e., the performance counter counts only a negligible number of mispredictions. If there is only one distance D_F, then the BTB is direct-mapped, and address bits used as the BTB index are Addr[i+j-1 : i] (Figure 7).
If there are exactly two distances D_F, the BTB is 2-way set-associative, and bits used as index are Addr[i+j-2 : i]. Similarly, if there are exactly three distances D_F, the BTB is 4-way set-associative. In general, if there are m "fitting" distances, the BTB is 2^(m-1)-way set associative, and the index bits are Addr[i+j-m : i].

[Figure 7: diagram of a branch address with bit positions i+j-1 ... i (field DM_Index) and i-1 ... 1, 0 (field Distance), where j = log2 N_BTB. The index window DM_Index_T is drawn for distances D = 2, D = 4, ..., D = 2^(i-1), D = 2^i.]

Figure 7. BTB size and organization: varying the distance.

    ...
    for (i=0; i < liter; i++) {
      _asm {
              noop
              ...
              noop
              mov eax, 10

              cmp eax, 15
              jle l0
              noop
              ...
              noop
      l0:     jle l1
              noop
              ...
              noop
      l1:     jle l2
              ...

      l510:   noop
      }
    }

Figure 8. Benchmark for testing BTB organization.
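
The relation between "fitting" distances and associativity can be explored with a small software model before running on hardware. The sketch below is not the authors' code: the `Btb` structure, LRU replacement, the use of the full address as a tag, and the example geometry are all assumptions made for illustration, and for simplicity it places N_BTB branches at multiples of D instead of N_BTB - 1 branches plus the loop branch.

```c
#include <stdint.h>
#include <string.h>

#define MAX_SETS 1024
#define MAX_WAYS 8

/* Hypothetical BTB model: `sets` x `ways` entries, LRU replacement,
   set index = (address >> index_shift) mod sets. */
typedef struct {
    unsigned sets, ways, index_shift;
    unsigned long clock;
    uint32_t addr[MAX_SETS][MAX_WAYS];
    unsigned long last_use[MAX_SETS][MAX_WAYS]; /* 0 means empty */
} Btb;

void btb_init(Btb *b, unsigned sets, unsigned ways, unsigned index_shift)
{
    memset(b, 0, sizeof *b);
    b->sets = sets;
    b->ways = ways;
    b->index_shift = index_shift;
}

/* Look up a branch address: returns 1 on a hit; on a miss, replaces the
   least recently used (or an empty) way in the set and returns 0. */
int btb_access(Btb *b, uint32_t a)
{
    unsigned set = (a >> b->index_shift) % b->sets;
    unsigned victim = 0;
    b->clock++;
    for (unsigned w = 0; w < b->ways; w++) {
        if (b->last_use[set][w] != 0 && b->addr[set][w] == a) {
            b->last_use[set][w] = b->clock;
            return 1;
        }
        if (b->last_use[set][w] < b->last_use[set][victim])
            victim = w;
    }
    b->addr[set][victim] = a;
    b->last_use[set][victim] = b->clock;
    return 0;
}

/* Misses counted over `passes` loop iterations, after one warm-up pass,
   for `branches` taken branches spaced `distance` bytes apart. */
unsigned long btb_misses(unsigned sets, unsigned ways, unsigned index_shift,
                         unsigned branches, uint32_t distance, unsigned passes)
{
    static Btb b;
    unsigned long misses = 0;
    btb_init(&b, sets, ways, index_shift);
    for (unsigned p = 0; p <= passes; p++)
        for (unsigned n = 0; n < branches; n++) {
            int hit = btb_access(&b, n * distance);
            if (p > 0 && !hit)
                misses++;
        }
    return misses;
}
```

With 512 branches and an index starting at bit 4, a direct-mapped 512-entry model reports zero misses only at D = 16, while a 2-way model with 256 sets reports zero misses at both D = 8 and D = 16, matching the one-distance and two-distance cases in the text. A real predictor may use a different replacement policy or partial tags, so the model only indicates what pattern to look for.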
[Figure 8 annotations: the first branch is placed at a multiple of distance D from the start of the block, and each following branch is placed at distance D from the previous one.]

There is one exception to this experiment, and that is the unlikely border case in which low-order address bits are used as the index, i.e., Addr[j-1 : 0]. For any degree of associativity, this BTB will have only one "fitting" distance D_F = 1. In this case, an additional experiment is necessary to establish the number of BTB ways. Instead of finding the number of branches that would fill the whole BTB, this additional experiment finds the number of branches that fill a BTB set, and a distance D_S such that those branches map into the same set. If there are more branches than ways mapping into the same set, the misprediction rate will be high. The same number of branches at some other distance might also produce a high MPR, if there are sets where the number of competing branches is larger than the number of the BTB ways.
For example, 16 branches mapping into a 4-way set will have a high MPR, as well as 16 branches mapping into two 4-way sets. If the number of branches is equal to or less than the number of ways, they do not collide at any distance. The corresponding microbenchmark is similar to the one described for the previous experiment (Figure 8), but in general, it requires a larger number of runs to establish the correct BTB organization, since both the number of branches fitting in the set and the branch distance must be varied. Figure 9 shows the search process for the correct number of BTB ways. The algorithm first picks an arbitrarily large number of branches and sets them at the smallest possible distance D. If the MPR is low, the distance is increased and the experiment is repeated. When a high MPR is reached, it means that B branches collide in the same set, and the number of branches is decreased.
The process stops when the maximum distance is reached, unless the number of branches picked at the beginning is smaller than the number of ways. In this case, the MPR is low throughout the series of experiments, and the number of branches B should be increased.

[Figure 9: flowchart. Start: pick an arbitrarily large number of branches B and the smallest possible distance D, perform the experiment. Decision "High MPR for B at D?": if yes, decrease B and repeat the experiment; if no, go to decision "D reached maximum?". If D has not reached the maximum, increase D and repeat the experiment; if it has, go to decision "Was MPR ever high in the experiment?". If no, reset D, increase B, and repeat the experiment; if yes, the number of ways is B.]

Figure 9. Searching the number of branches that fill a cache set.

A variation of the microbenchmark shown in Figure 8 can be used to verify the assumption about the number of BTB entries, by increasing the number of branches for the "fitting" distances.
) Tj -288 -25.5 TD -0.0796 Tc 1.0566 Tw (For example, if the actual number of BTB entries is twice as large as the assumed one, and the ) Tj 0 -24.75 TD -0.1418 Tc 0.1418 Tw (previous experiments have found ) Tj 150 0 TD /F2 11.25 Tf 0.1275 Tc 0 Tw (m) Tj 7.5 0 TD /F0 11.25 Tf 0.0135 Tc 0.174 Tw ( distances D) Tj 55.5 -1.5 TD /F0 6.75 Tf -0.003 Tc 0 Tw (F) Tj 3.75 1.5 TD /F0 11.25 Tf -0.1226 Tc 0.3851 Tw (, the set of experiments with the actual number of ) Tj -216.75 -25.5 TD -0.1394 Tc 0.0769 Tw (entries should find ) Tj 84.75 0 TD /F2 11.25 Tf 0.1275 Tc 0 Tw (m) Tj 7.5 0 TD 0.0038 Tc (-) Tj 3.75 0 TD 0.375 Tc (1) Tj 6 0 TD /F0 11.25 Tf 0.0947 Tc 0.0928 Tw ( such) Tj 23.25 0 TD -0.101 Tc 0.2885 Tw ( distances; i.e., the BTB would be 2) Tj 159 5.25 TD /F0 6.75 Tf -0.0015 Tc 0 Tw (m) Tj 4.5 0 TD 0.0023 Tc (-) Tj 2.25 0 TD 0.375 Tc (2) Tj 3.75 -5.25 TD /F0 11.25 Tf 0.0038 Tc (-) Tj 3.75 0 TD -0.07 Tc 0.1075 Tw (way set associative. In general, ) Tj -298.5 -25.5 TD -0.0775 Tc 1.765 Tw (if the actual number of BTB entries is 2) Tj 187.5 5.25 TD /F0 6.75 Tf 0.375 Tc 0 Tw (n) Tj 3.75 -5.25 TD /F0 11.25 Tf -0.1368 Tc 1.741 Tw ( times greater than the assumed one, the experiments ) Tj -191.25 -24.75 TD -0.1628 Tc 3.3503 Tw (should find) Tj 0 Tc -0.5625 Tw ( ) Tj 58.5 0 TD /F2 11.25 Tf 0.1275 Tc 0 Tw (m) Tj 7.5 0 TD 0.0038 Tc (-) Tj 3.75 0 TD 0.375 Tc (n) Tj 6 0 TD /F0 11.25 Tf -0.139 Tc 3.2583 Tw ( \223fitting\224 distances. If the experiments with a larger number of condition) Tj 351.75 0 TD 0.4388 Tc -0.2513 Tw (al ) Tj -427.5 -25.5 TD -0.1407 Tc 0.3282 Tw (branches do not find any such distance, the assumption about the size is correct. 
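The relationship between the size assumption and the number of “fitting” distances can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original experiments; the function names are hypothetical, and the associativity helper covers only the doubled-size case worked out in the text.

```c
#include <assert.h>

/* Hypothetical helper: number of "fitting" distances the experiments should
 * find when the real BTB has 2^n times as many entries as assumed, given
 * that the earlier experiments found m distances. */
int expected_fitting_distances(int m, int n)
{
    assert(n >= 0 && n <= m);
    return m - n;
}

/* Hypothetical helper for the worked case in the text (real size is twice
 * the assumed one): the BTB would be 2^(m-2)-way set associative. */
unsigned ways_if_size_doubled(int m)
{
    assert(m >= 2 && m - 2 < 32);
    return 1u << (m - 2);
}
```

For instance, with m = 4 previously found distances, a BTB twice as large as assumed should yield 3 fitting distances and would be 4-way set associative.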
Outcome Predictor Experiments

The set of experiments for uncovering the characteristics of the outcome predictor component (Figure 10) is devised in such a way that all the branches but a few are easily predictable; i.e., those few “spy” branches generate the misprediction rate for the whole microbenchmark. The microbenchmarks should be carefully tuned to avoid interference between different branches in the branch predictor. Since the BTB organization is known from the previous set of experiments, it is possible to check the assembly code for branch interference and insert dummy instructions if necessary.

Step 1. This step determines the maximum length of a local history pattern that the predictor can correctly predict, for just one branch in the loop, i.e., the “spy” branch. The loop condition branch has just one not-taken outcome, when the loop exits; otherwise it is taken. After enough iterations, the misprediction due to this branch is negligible. For the “spy” branch, different repeating local history patterns of length LSpy can be used; however, the simplest pattern has all outcomes the same but the last one. If “1” means that the branch is taken, and “0” not taken, such local history patterns are 1111...110 and 0000...001.

Figure 11a shows the code for the Step 1 experiment, and Figure 11b shows the fragment of the corresponding assembly code for the Intel x86 architecture, when the pattern length is LSpy=4. Note that the “spy” branch if ((i%4)==0) is compiled as jne (jump short if not equal), so the local history pattern for this branch is 1110. The fragment does not show the loop, which is compiled as the combination of a jae instruction (jump short if above or equal) at the beginning of the loop and an unconditional jmp at the end, so the jae outcome is 0 until the loop exit.
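To make the pattern concrete, the sketch below enumerates the outcomes of the “spy” branch over one period. It assumes, as in the x86 fragment described above, that the compiler emits a jne that jumps over the if-body, so the branch is taken whenever i % LSpy is nonzero; it also assumes, for illustration only, that the period starts at i = 1. The function names are hypothetical.

```c
#include <assert.h>

/* Outcome of the "spy" branch in iteration i: 1 = taken, 0 = not taken.
 * Assumes `if ((i % lspy) == 0)` is compiled as a jne that skips the body. */
int spy_outcome(unsigned long i, unsigned long lspy)
{
    assert(lspy > 0);
    return (i % lspy) != 0;
}

/* Writes one period of the local history pattern (iterations 1..lspy) into
 * buf as '0'/'1' characters. buf must hold at least lspy + 1 bytes. */
void spy_pattern(unsigned long lspy, char *buf)
{
    unsigned long k;
    for (k = 0; k < lspy; ++k)
        buf[k] = spy_outcome(k + 1, lspy) ? '1' : '0';
    buf[lspy] = '\0';
}
```

With lspy = 4 the buffer holds 1110, which is the pattern quoted in the text for the jne encoding.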
[Figure 10, flowchart:
  Step 1: What is maximum length of the "spy" branch pattern that would be correctly predicted when the spy branch is the only branch in a loop? -> Pattern length = L
  Step 2: Are there (L - 1) bits of local component or (2*L - 1) bits of global component?
    local -> Step 3: Is there a global component that uses at least 2 bits of global history?
      Yes -> Step 4: How many bits in global history register?
      No -> Step 5: 0 or 1 bit in global history register?
    global -> Step 6: Is there a local component that uses at least n bits of local history?]

Figure 10 Experiment flow for outcome predictor.
    void main(void) {
      int long unsigned i;
      int a=1;
      int long unsigned liter = 10000000;

      for (i=0; i [...]

[...] >L, the “spy” branch is mispredicted once in LSpy times. However, this experiment does not tell whether the predictor has a local prediction component with history registers of length L-1, or a global predictor component with a history register of length 2*(L-1). Two cases must be considered, as depicted in Figure 12:

(a) The outcome predictor has a local history component, so any local pattern of the length L can be correctly predicted, including the “spy” pattern.

(b) The outcome predictor has a global history component, so the local history pattern 11...10 of the “spy” branch with L-1 1’s is correctly predicted, but by using the global history of previous 2*(L-1) branches. Since the microbenchmark has just the loop condition and the “spy” branch, all predictions are correct if all relevant local history fits into the global history register.
For example, just before execution of the “spy” branch with 0 outcome, the content of the global history register is 1010...10, where the 1’s are outcomes of the “spy” branch, and the 0’s are the outcomes of the loop condition branch.

[Figure 12, diagram: the “spy” pattern 111...10 of length L is predicted either (a) from local history: 111...1 (L-1 bits) => 0, or (b) from global history: 101010...01 (2(L-1) bits, alternating "spy branch" and "loop" outcomes) => 0.]

Figure 12 Two possible cases for maximum predictable pattern length L in Step 1

Step 2. Step 2 verifies which one of these two hypotheses matches the predictor under test.
If the conditional branch in the loop is preceded by 2*(L-1) “dummy” conditional branches that always have the same outcome, then no local “spy” history is present in the global history register when the “spy” branch prediction is generated. One example of a “dummy” branch is if (i<0) a=1 (Figure 13). If the MPR still stays low, the correct hypothesis is (a); i.e., the predictor has a local history component. The experiment flow then proceeds to Step 3, which determines whether the outcome predictor also has a global history component. If the MPR increases, the correct hypothesis is (b); i.e., the predictor has a global history component. In this case, the experiment flow proceeds to Step 6 to determine whether the outcome predictor also has a local history component.
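The Step 2 decision can be summarised as a small classification routine. This is a sketch under stated assumptions: the enum, the function names, and the numeric threshold separating a “low” misprediction rate from an increased one are hypothetical and would need to be calibrated against the Step 1 measurement on the machine under test.

```c
#include <assert.h>

/* Hypothetical labels for the two hypotheses of Figure 12. */
enum step2_result {
    HYP_LOCAL,   /* (a) local history component: continue with Step 3 */
    HYP_GLOBAL   /* (b) global history component: continue with Step 6 */
};

/* Number of "dummy" branches placed before the "spy" branch in Step 2,
 * where L is the maximum predictable pattern length found in Step 1. */
unsigned dummy_branches_for_step2(unsigned L)
{
    assert(L >= 1);
    return 2u * (L - 1u);
}

/* Classifies the predictor from the misprediction rate (MPR) measured with
 * the dummy branches inserted. low_mpr_threshold is an assumed cutoff. */
enum step2_result classify_step2(double mpr_with_dummies,
                                 double low_mpr_threshold)
{
    assert(mpr_with_dummies >= 0.0 && low_mpr_threshold >= 0.0);
    return (mpr_with_dummies <= low_mpr_threshold) ? HYP_LOCAL : HYP_GLOBAL;
}
```

For L = 4, six dummy branches are inserted; an MPR that stays at or below the chosen cutoff selects hypothesis (a), and a higher MPR selects hypothesis (b).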
    void main(void) {
      int long unsigned i;
      int a=1;
      int long unsigned liter = 10000000;
      for (i=0; i [...]

[...]

Step 4. This step determines the length of the global history register. The simplest way is to insert “dummy” conditional branches (e.g., pattern 111...11) before the “spy” conditional branch. The “spy” branch is not predicted correctly if the number of “dummy” branches is greater than the number of global history bits – 2, so the number of global history bits is determined by varying the number of “dummy” branches.

    void main(void){
      int a,b,c;
      int long unsigned i;

      for (i=1;i<=10000000;++i){
        if ((i%L1) == 0) a=1;
        else a=0;
        if ((i%L2) == 0) b=1;
        else b=0;
        if (i<0) a=1; //dummy branch
        ...
        if (i<0) a=1; //dummy branch
        if ((a*b) == 1) c=1;
      }
    }

Figure 15 Step 4 microbenchmark.

Step 5.
) Tj 37.5 0 TD /F0 11.25 Tf -0.1251 Tc 1.1698 Tw (The Step 5 microbenchmark has just two conditional branches in the loop, where the ) Tj -49.5 -24.75 TD -0.1114 Tc 0.3926 Tw (first one has the local history pattern 111...110 o) Tj 214.5 0 TD -0.1083 Tc 0.5458 Tw (f a length ) Tj 45.75 0 TD /F2 11.25 Tf -0.0572 Tc 0 Tw (L3>L) Tj 25.5 0 TD /F0 11.25 Tf -0.0925 Tc 0.1862 Tw (, and the second one has the same ) Tj -285.75 -25.5 TD -0.0758 Tc 0.3705 Tw (outcome as the first, as shown in ) Tj 150 0 TD -0.1456 Tc 0 Tw (Figure) Tj 28.5 0 TD 0 Tc 0.1875 Tw ( ) Tj 3.75 0 TD -0.375 Tc 0 Tw (16) Tj 10.5 0 TD -0.0522 Tc 0.3647 Tw (. Since it is known from Step 3 that the predictor does ) Tj -192.75 -25.5 TD -0.1289 Tc 0.4235 Tw (not use more than one global history bit, the first conditional branch is mispredicted ) Tj 375.75 0 TD -0.0208 Tc 0.2083 Tw (once in every ) Tj -375.75 -25.5 TD /F2 11.25 Tf 0.06 Tc 0.1275 Tw (L3 ) Tj 15 0 TD /F0 11.25 Tf -0.1308 Tc 0.4065 Tw (times. If there is no global component at all, the second branch is also mispredicted once in ) Tj 409.5 0 TD /F2 11.25 Tf 0.06 Tc 0.1275 Tw (L3 ) Tj -424.5 -24.75 TD /F0 11.25 Tf -0.1902 Tc 1.9459 Tw (times, while it is always predicted correctly if there is a one) Tj 278.25 0 TD 0.0038 Tc 0 Tw (-) Tj 3.75 0 TD -0.1744 Tc 1.5619 Tw (bit global history component. The ) Tj -282 -25.5 TD -0.1477 Tc 1.0852 Tw (number of mispredictions in this experiment ) Tj 203.25 0 TD -0.1012 Tc 1.0387 Tw (determines the existence of a one) Tj 152.25 0 TD 0.0038 Tc 0 Tw (-) Tj 3.75 0 TD -0.1411 Tc 1.0786 Tw (bit global history ) Tj -359.25 -25.5 TD -0.1425 Tc 0 Tw (predictor.) 
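The Step 5 reasoning can be illustrated with a small software model. The sketch below is not the predictor of any real processor: it assumes 2-bit saturating counters, no local history at all (standing in for a local history too short to capture a pattern of length L3), and at most one global history bit holding the outcome of the most recently executed branch. The loop-closing branch is ignored. The names `update_counter` and `spy_mispredictions` are hypothetical.

```c
#include <stddef.h>

/* Saturating 2-bit counter: move toward 3 on taken, toward 0 on not taken. */
unsigned char update_counter(unsigned char c, unsigned taken)
{
    if (taken)
        return (unsigned char)(c < 3 ? c + 1 : 3);
    return (unsigned char)(c > 0 ? c - 1 : 0);
}

/* Models two branches per loop iteration with identical outcomes that
   follow the pattern 1,1,...,1,0 of length l3 (l3 must be nonzero).
   global_bits is 0 (no global component) or 1 (one global history bit).
   Returns the mispredictions of the second ("spy") branch and stores
   those of the first branch in *first_misses when it is not NULL. */
unsigned long spy_mispredictions(unsigned l3, unsigned global_bits,
                                 unsigned long iterations,
                                 unsigned long *first_misses)
{
    /* counter[branch][global history bit], all start weakly taken */
    unsigned char counter[2][2] = { {2, 2}, {2, 2} };
    unsigned ghist = 0;               /* outcome of the previous branch */
    unsigned long miss[2] = {0, 0};

    for (unsigned long i = 1; i <= iterations; ++i) {
        unsigned outcome = (i % l3) != 0;   /* 0 once every l3 iterations */
        for (int b = 0; b < 2; ++b) {
            unsigned idx = global_bits ? ghist : 0;
            unsigned predicted = counter[b][idx] >= 2;
            if (predicted != outcome)
                ++miss[b];
            counter[b][idx] = update_counter(counter[b][idx], outcome);
            ghist = outcome;
        }
    }
    if (first_misses != NULL)
        *first_misses = miss[0];
    return miss[1];
}
```

Under these assumptions, with l3 = 8 and 8000 iterations the first branch is mispredicted 1000 times in both configurations, while the second branch is mispredicted 1000 times without the global bit and only once (during warm-up) with it, which mirrors the distinction that Step 5 relies on.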

    void main(void){
      int a;
      int long unsigned i;
      int long unsigned liter = 10000000;
      for (i=1;i<=liter;++i){
        if ((i%L3) == 0) a=1; //L3 > L
        if ((i%L3) == 0) a=1; //spy branch
      }
    }

Figure 16 Step 5 microbenchmark.

Step 6. The presence of a global component with 2*(L-1) history bits is proved in the previous steps, and this step probes for the presence of a local component. The Step 6 microbenchmark has 2*(L-1) “dummy” branches (Figure 17) and varies the pattern length LSpy of the “spy” branch. If the MPR is low for some LSpy, there is an equivalent local component with at least LSpy-1 history bits. Depending on the decision mechanism, there could be more local history bits, so further experiments might be needed. This is outside the scope of this paper.
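The link between a low MPR for a given LSpy and a local history of at least LSpy-1 bits can be checked on a simplified model. The sketch below assumes a purely local predictor: one history register for the “spy” branch, indexing a table of 2-bit saturating counters. It is an illustration under those assumptions, not a description of the P6 or Pentium 4 hardware, and the name `local_mispredictions` and the limit `MAX_HIST_BITS` are hypothetical.

```c
#include <string.h>

#define MAX_HIST_BITS 16

/* Counts mispredictions of a single branch whose outcome repeats the
   pattern 1,1,...,1,0 with the given period, using hist_bits bits of
   local history to index a table of 2-bit saturating counters. */
unsigned long local_mispredictions(unsigned period, unsigned hist_bits,
                                   unsigned long iterations)
{
    static unsigned char counter[1u << MAX_HIST_BITS];
    unsigned mask;
    unsigned history = 0;
    unsigned long misses = 0;

    if (period == 0 || hist_bits > MAX_HIST_BITS)
        return 0;                       /* not supported by this sketch */
    mask = (1u << hist_bits) - 1u;
    memset(counter, 2, sizeof counter); /* every entry starts weakly taken */

    for (unsigned long i = 1; i <= iterations; ++i) {
        unsigned outcome = (i % period) != 0;  /* 0 once per period */
        unsigned predicted = counter[history] >= 2;
        if (predicted != outcome)
            ++misses;
        if (outcome) {
            if (counter[history] < 3)
                ++counter[history];
        } else {
            if (counter[history] > 0)
                --counter[history];
        }
        /* shift the newest outcome into the local history register */
        history = ((history << 1) | outcome) & mask;
    }
    return misses;
}
```

In this model a pattern of length 8 run for 8000 iterations is mispredicted 1000 times with 3 history bits, but only once (the first occurrence of the 0 outcome) with 7 history bits, that is, with LSpy-1 bits.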

    void main(void) {
      int long unsigned i;
      int a=1;
      int long unsigned liter = 10000000;
      for (i=0; i
    [remainder of the listing is truncated in the source]

[Figure 18: bar chart. Vertical axis: misprediction rate (0%, 50%, 100%); horizontal axis: distance (2, 4, 8, 16, 32, 64). The bars are at about 100% for distances 2, 32, and 64, and at about 0% for distances 4, 8, and 16.]

Figure 18 Misprediction rate for N_BTB conditional branches, varying distance.

This result can also be obtained by trying to map B branches in the same set, varying the distance between them and the number of branches (Table 2). It can be seen that 16 branches collide in the same set when at a distance of 1024, and 8 branches collide at a distance of 2048, while 4 branches do not collide at any distance. Hence, the conclusion is the same: the P6 architecture has 4 cache ways (Figure 19).
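The collision argument can be reproduced with a small set-mapping model. The sketch below assumes a BTB with 128 sets and 4 ways, distances measured in bytes, and a set index taken from address bits 4 to 10. That index function is a hypothetical choice made only because it is consistent with the measurements in Table 2; the source text does not state it here. The names `btb_set`, `max_set_occupancy`, and `fits_in_btb` are likewise illustrative.

```c
enum { BTB_SETS = 128, BTB_WAYS = 4, INDEX_SHIFT = 4 };

/* Hypothetical set-index function: address bits 4..10 select the set. */
unsigned btb_set(unsigned long addr)
{
    return (unsigned)((addr >> INDEX_SHIFT) % BTB_SETS);
}

/* Largest number of branches that land in any single set when
   `branches` branches are placed `distance` bytes apart, starting
   at address 0. */
unsigned max_set_occupancy(unsigned branches, unsigned long distance)
{
    unsigned count[BTB_SETS] = {0};
    unsigned worst = 0;

    for (unsigned b = 0; b < branches; ++b) {
        unsigned s = btb_set((unsigned long)b * distance);
        if (++count[s] > worst)
            worst = count[s];
    }
    return worst;
}

/* Nonzero when no set receives more branches than there are ways,
   so all branches can stay resident in the modeled BTB. */
int fits_in_btb(unsigned branches, unsigned long distance)
{
    return max_set_occupancy(branches, distance) <= BTB_WAYS;
}
```

Under these assumptions `fits_in_btb(16, 512)`, `fits_in_btb(8, 1024)`, `fits_in_btb(4, 2048)`, and `fits_in_btb(4, 4096)` are nonzero, while `fits_in_btb(16, 1024)` and `fits_in_btb(8, 2048)` are zero, which matches the low and high misprediction counts reported in Table 2.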
Tj 6.75 0 TD 0 Tc 0.1875 Tw ( ) Tj -105 -26.25 TD /F3 11.25 Tf 0.0487 Tc 0.8888 Tw (Table ) Tj 30 0 TD 0.375 Tc 0 Tw (2) Tj 6 0 TD -0.1324 Tc 0.1468 Tw ( P6 branch mispredictions when trying to map B branches in the same set.) Tj 348 0 TD 0 Tc 0.1875 Tw ( ) Tj -244.5 -34.5 TD -0.0187 Tc -0.5438 Tw (Iterations: 1M, B = 16) Tj 104.25 0 TD 0 Tc 0.1875 Tw ( ) Tj ET 188.25 360.75 0.75 0.75 re f 188.25 360.75 0.75 0.75 re f 189 360.75 233.25 0.75 re f 422.25 360.75 0.75 0.75 re f 422.25 360.75 0.75 0.75 re f 188.25 342 0.75 18.75 re f 422.25 342 0.75 18.75 re f BT 198 328.5 TD -0.1866 Tc 0 Tw (Distance) Tj 41.25 0 TD 0 Tc 0.1875 Tw ( ) Tj 41.25 0 TD -0.1247 Tc 0.3122 Tw (Mispredicted branches) Tj 108.75 0 TD 0 Tc 0.1875 Tw ( ) Tj ET 188.25 341.25 0.75 0.75 re f 189 341.25 57.75 0.75 re f 246.75 341.25 0.75 0.75 re f 247.5 341.25 174.75 0.75 re f 422.25 341.25 0.75 0.75 re f 188.25 322.5 0.75 18.75 re f 248.25 322.5 0.75 18.75 re f 246.75 322.5 0.75 18.75 re f 422.25 322.5 0.75 18.75 re f BT 210.75 309.75 TD /F0 11.25 Tf -0.375 Tc 0 Tw (512) Tj 15.75 0 TD 0 Tc 0.1875 Tw ( ) Tj 95.25 0 TD -0.2625 Tc 0.075 Tw ( 1,953 ) Tj 30 0 TD 0 Tc 0.1875 Tw ( ) Tj ET 188.25 321.75 0.75 0.75 re f 189 321.75 57.75 0.75 re f 248.25 321.75 0.75 0.75 re f 246.75 321.75 0.75 0.75 re f 249 321.75 173.25 0.75 re f 422.25 321.75 0.75 0.75 re f 188.25 303 0.75 18.75 re f 248.25 303 0.75 18.75 re f 246.75 303 0.75 18.75 re f 422.25 303 0.75 18.75 re f BT 208.5 290.25 TD -0.375 Tc 0 Tw (1024) Tj 21 0 TD 0 Tc 0.1875 Tw ( ) Tj 79.5 0 TD -0.1875 Tc 0 Tw ( 14,938,664 ) Tj 54.75 0 TD 0 Tc 0.1875 Tw ( ) Tj ET 188.25 302.25 0.75 0.75 re f 189 302.25 57.75 0.75 re f 248.25 302.25 0.75 0.75 re f 246.75 302.25 0.75 0.75 re f 249 302.25 173.25 0.75 re f 422.25 302.25 0.75 0.75 re f 188.25 283.5 0.75 18.75 re f 248.25 283.5 0.75 18.75 re f 246.75 283.5 0.75 18.75 re f 422.25 283.5 0.75 18.75 re f BT 256.5 270 TD /F3 11.25 Tf 0.0022 Tc -0.5647 Tw (Iterations: 1M, B = 8) Tj 99 0 TD 0 Tc 0.1875 Tw ( ) Tj ET 
188.25 282.75 0.75 0.75 re f 189 282.75 57.75 0.75 re f 246.75 282.75 0.75 0.75 re f 247.5 282.75 174.75 0.75 re f 422.25 282.75 0.75 0.75 re f 188.25 264 0.75 18.75 re f 422.25 264 0.75 18.75 re f BT 198 250.5 TD -0.1866 Tc 0 Tw (Distance) Tj 41.25 0 TD 0 Tc 0.1875 Tw ( ) Tj 41.25 0 TD -0.1247 Tc 0.3122 Tw (Mispredicted branches) Tj 108.75 0 TD 0 Tc 0.1875 Tw ( ) Tj ET 188.25 263.25 0.75 0.75 re f 189 263.25 57.75 0.75 re f 246.75 263.25 0.75 0.75 re f 247.5 263.25 174.75 0.75 re f 422.25 263.25 0.75 0.75 re f 188.25 244.5 0.75 18.75 re f 248.25 244.5 0.75 18.75 re f 246.75 244.5 0.75 18.75 re f 422.25 244.5 0.75 18.75 re f BT 208.5 231.75 TD /F0 11.25 Tf -0.375 Tc 0 Tw (1024) Tj 21 0 TD 0 Tc 0.1875 Tw ( ) Tj 92.25 0 TD -0.2625 Tc 0.075 Tw ( 2,520 ) Tj 30 0 TD 0 Tc 0.1875 Tw ( ) Tj ET 188.25 243.75 0.75 0.75 re f 189 243.75 57.75 0.75 re f 248.25 243.75 0.75 0.75 re f 246.75 243.75 0.75 0.75 re f 249 243.75 173.25 0.75 re f 422.25 243.75 0.75 0.75 re f 188.25 225.75 0.75 18 re f 248.25 225.75 0.75 18 re f 246.75 225.75 0.75 18 re f 422.25 225.75 0.75 18 re f BT 208.5 213 TD -0.375 Tc 0 Tw (2048) Tj 21 0 TD 0 Tc 0.1875 Tw ( ) Tj 82.5 0 TD -0.25 Tc 0.0625 Tw ( 6,927,480 ) Tj 48.75 0 TD 0 Tc 0.1875 Tw ( ) Tj ET 188.25 225 0.75 0.75 re f 189 225 57.75 0.75 re f 248.25 225 0.75 0.75 re f 246.75 225 0.75 0.75 re f 249 225 173.25 0.75 re f 422.25 225 0.75 0.75 re f 188.25 206.25 0.75 18.75 re f 248.25 206.25 0.75 18.75 re f 246.75 206.25 0.75 18.75 re f 422.25 206.25 0.75 18.75 re f BT 256.5 192.75 TD /F3 11.25 Tf 0.0022 Tc -0.5647 Tw (Iterations: 1M, B = 4) Tj 99 0 TD 0 Tc 0.1875 Tw ( ) Tj ET 188.25 205.5 0.75 0.75 re f 189 205.5 57.75 0.75 re f 246.75 205.5 0.75 0.75 re f 247.5 205.5 174.75 0.75 re f 422.25 205.5 0.75 0.75 re f 188.25 186.75 0.75 18.75 re f 422.25 186.75 0.75 18.75 re f BT 198 173.25 TD -0.1866 Tc 0 Tw (Distance) Tj 41.25 0 TD 0 Tc 0.1875 Tw ( ) Tj 41.25 0 TD -0.1247 Tc 0.3122 Tw (Mispredicted branches) Tj 108.75 0 TD 0 Tc 0.1875 Tw ( ) Tj ET 188.25 
186 0.75 0.75 re f 189 186 57.75 0.75 re f 246.75 186 0.75 0.75 re f 247.5 186 174.75 0.75 re f 422.25 186 0.75 0.75 re f 188.25 167.25 0.75 18.75 re f 248.25 167.25 0.75 18.75 re f 246.75 167.25 0.75 18.75 re f 422.25 167.25 0.75 18.75 re f BT 208.5 154.5 TD /F0 11.25 Tf -0.375 Tc 0 Tw (2048) Tj 21 0 TD 0 Tc 0.1875 Tw ( ) Tj 92.25 0 TD -0.2625 Tc 0.075 Tw ( 2,400 ) Tj 30 0 TD 0 Tc 0.1875 Tw ( ) Tj ET 188.25 166.5 0.75 0.75 re f 189 166.5 57.75 0.75 re f 248.25 166.5 0.75 0.75 re f 246.75 166.5 0.75 0.75 re f 249 166.5 173.25 0.75 re f 422.25 166.5 0.75 0.75 re f 188.25 147.75 0.75 18.75 re f 248.25 147.75 0.75 18.75 re f 246.75 147.75 0.75 18.75 re f 422.25 147.75 0.75 18.75 re f BT 208.5 135 TD -0.375 Tc 0 Tw (4096) Tj 21 0 TD 0 Tc 0.1875 Tw ( ) Tj 92.25 0 TD -0.2625 Tc 0.075 Tw ( 4,097 ) Tj 30 0 TD 0 Tc 0.1875 Tw ( ) Tj ET 188.25 147 0.75 0.75 re f 189 147 57.75 0.75 re f 248.25 147 0.75 0.75 re f 246.75 147 0.75 0.75 re f 249 147 173.25 0.75 re f 422.25 147 0.75 0.75 re f 188.25 128.25 0.75 18.75 re f 188.25 127.5 0.75 0.75 re f 188.25 127.5 0.75 0.75 re f 189 127.5 57.75 0.75 re f 248.25 128.25 0.75 18.75 re f 246.75 128.25 0.75 18.75 re f 246.75 127.5 0.75 0.75 re f 247.5 127.5 174.75 0.75 re f 422.25 128.25 0.75 18.75 re f 422.25 127.5 0.75 0.75 re f 422.25 127.5 0.75 0.75 re f BT 99.75 118.5 TD ( ) Tj ET endstream endobj 98 0 obj 27954 endobj 96 0 obj << /Type /Page /Parent 78 0 R /Resources << /Font << /F0 6 0 R /F3 14 0 R /F4 26 0 R >> /ProcSet 2 0 R >> /Contents 97 0 R >> endobj 101 0 obj << /Length 102 0 R >> stream BT 87.75 39 TD 0 0 0 rg /F0 12 Tf 0 Tc 0 Tw ( ) Tj 216 0 TD ( ) Tj -204 672 TD /F0 11.25 Tf -0.1283 Tc 2.5658 Tw (Finally, to verify wh) Tj 96.75 0 TD -0.0954 Tc 2.3284 Tw (ether the correctness of the assumption about BTB size, the different ) Tj -108.75 -25.5 TD -0.1297 Tc 1.0672 Tw (distance experiment is performed with twice as many branches. 
Table 3 shows results for the P6 architecture for 1024 branches. The distances that produced the low MPR when the number of branches was 512 now produce an MPR close to 100%. Hence, the actual number of BTB entries is 512.

Table 3 P6 branch mispredictions when the total number of branches is 2*N_BTB.

Iter. 1M, B = 1024
Distance    Mispredicted branches
4           1,017,750,000
8           1,016,900,000
16          1,020,700,000

[Figure: P6 address bits 4-10 (7 bits, labeled "Index") select one of sets 0-127 in the P6 BTB; NetBurst address bits 4-13 (10 bits, labeled "Index") select one of sets 0-1023 in the NetBurst BTB. The address bits below the index are labeled "Distance".]

Figure 19 P6 and NetBurst BTB size and organization.
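
The set indexing shown in Figure 19 can be illustrated with a small model. The sketch below is not the authors' microbenchmark; it is a simplified simulation that assumes 4 ways per set (512 entries / 128 sets for P6) and assumes that a group of branches causes no BTB conflicts as long as no set receives more branches than it has ways. The function names `btb_set_index`, `max_set_occupancy`, and `fits` are hypothetical and introduced only for illustration.

```python
from collections import Counter


def btb_set_index(address: int, n_sets: int, low_bit: int = 4) -> int:
    """Set index taken from address bits low_bit .. low_bit + log2(n_sets) - 1."""
    return (address >> low_bit) & (n_sets - 1)


def max_set_occupancy(n_branches: int, distance: int, n_sets: int,
                      low_bit: int = 4) -> int:
    """Largest number of branches mapped to one set when n_branches
    branches are placed `distance` bytes apart, starting at address 0."""
    counts = Counter(
        btb_set_index(i * distance, n_sets, low_bit) for i in range(n_branches)
    )
    return max(counts.values())


def fits(n_branches: int, distance: int, n_sets: int, ways: int = 4) -> bool:
    """True if no set receives more branches than it has ways (assumed model)."""
    return max_set_occupancy(n_branches, distance, n_sets) <= ways


if __name__ == "__main__":
    # (name, number of sets, number of branches) taken from Figure 19 and the text
    for name, n_sets, n_branches in (("P6", 128, 512), ("P6, 2*N_BTB", 128, 1024),
                                     ("NetBurst front-end", 1024, 4096)):
        for distance in (4, 8, 16, 32):
            status = "fits" if fits(n_branches, distance, n_sets) else "conflicts"
            print(f"{name}: B={n_branches}, distance={distance}: {status}")
```

Under these assumptions, 512 branches at distances 4, 8, or 16 fit in a 128-set, 4-way structure, while 1024 branches overflow it at the same distances, which agrees with the behavior reported in Table 3.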

The results are similar for NetBurst architecture (N_BTB-FE = 4096); i.e., the MPR is close to 0% when the distance between addresses of subsequent branches is 4, 8, or 16; and it is close to 100% for other distances. Therefore, the front-end BTB has 4 ways and 1024 sets, while bits 4-13 are used as the set index (Figure 19).

Outcome Predictor Results – P6 Architecture

Step 1. Table 4 shows the results of the Step 1 experiment (Figure 11).
The maximum length of a correctly predicted pattern is 5, since the spy branch with a pattern of length 6 is mispredicted once in each 6 times (10,000,000/6 = 1,666,666), which is close to the number of mispredicted branches shown in Table 4. This result can be caused by a local predictor component that uses 4 bits of local history, or a global component that uses 8 global history bits.

Table 4 Results of the Step 1 experiment.

        P6                                      NetBurst
Iter.   Pattern length  Mispredicted branches   Pattern length  Mispredicted branches
10 M    4               420                     5               987
        5               432                     6               973
        6               1,545,480               7               957
                                                8               1,256
                                                9               918
                                                10              964,830

Step 2. The microbenchmark has eight “dummy” conditional branches before the “spy” branch. Since the MPR is still close to 0 for a longer global history pattern, the P6 architecture uses a local branch history of length 4.

Step 3. The microbenchmark has three conditional branches in a loop, where the first two have patterns 11...10 of length 5 and 2, and hence are predictable by the local predictor component. The outcome of the third branch is correlated with the previous two. Since it has a pattern 11...10 of length 10, it is not predictable by the local component with 4 history bits. The MPR is about 10%, which means that the third branch is mispredicted once in each 10 times, when its outcome is 0. Hence, the P6 architecture does not use a global history pattern of length greater than or equal to two.

Step 4. The Step 4 experiment is a 10 million iteration loop, with two conditional branches.
The first branch has a pattern 111110 of length 6, so it is not predictable by the local component, and the second branch is correlated with it by having the same outcome. The result is about 3 million mispredicted branches, so both conditional branches are mispredicted once in six times. Therefore, the P6 architecture does not include a global prediction component.

Outcome Predictor Results - NetBurst Architecture

Step 1. Table 4 shows the results of the Step 1 experiment: the maximum length of a correctly predicted pattern is 9, since the “spy” branch with a pattern of length 10 is mispredicted once in each 10 times, for about 1 million mispredictions. These results can be explained by either an 8-bit local history register or a 16-bit global history register.

Step 2. The microbenchmark has 16 “dummy” branches before the “spy” branch with a local pattern of length 9. The measured MPR is about 10%; i.e., the “spy” branch is mispredicted once in 9 times. Therefore, the Step 1 result is caused by a global component that uses 16 global history bits.

Step 6. After several runs of different Step 6 experiments, the first conclusion might be that the NetBurst architecture uses one local history bit for prediction, since a pattern of length 2 is predicted correctly (Table 5). Because this architecture includes the trace cache, an additional experiment is needed, with the structure from the Step 6 experiment repeated 10 times in sequence: 16 “dummy” branches, and one “spy” branch with a local history pattern of length 2. The “spy” branches have an MPR of about 50%, which is expected for the outcome predictor without any local component.
Hence, the low MPR in Step 6 with pattern length 2 is due to the trace cache, since it is able to store the sequence “loop, 16 dummy branches, spy taken, loop, 16 dummy branches, spy not taken” as one continuous trace.

Table 5 Results of the Step 6 experiment.

Iter.   Pattern length  Mispredicted spy branches
10 M    2               0%
        3               33%
        4               25%
        5               20%

Conclusion

The continual growth in complexity of processor features, such as wide issue, deep pipelining, branch prediction, and multiple levels of cache hierarchy, puts more demand on code optimizations to achieve optimal performance.
While current compilers depend on a programmer to specify for which architecture to optimize the code, and to manually adjust the code to a specific architecture, future compilers should be more architecture-aware and be able to discover the relevant characteristics of the underlying architecture without a programmer’s input. Consequently, the burden of optimization for different architectures will shift from a program developer to the compiler, and optimization will become more automated. Unfortunately, not all architecture details are publicly available, so the optimization process cannot rely solely on information given in manufacturers’ manuals. To determine architecture intricacies, an architecture-aware compiler should run a set of carefully tuned microbenchmarks.

This paper presents a systematic approach to uncovering the basic characteristics of branch predictors. The proposed experiment flow encompasses microbenchmarks aimed at determining relevant branch predictor parameters: namely, branch target buffer associativity and the address bits used as index, the existence of local and global branch history components, and the number of corresponding history bits. These parameters can be used for automatic or manual code optimization. The proposed experiments can also be applied during the verification phase of processor design, and used as a starting point for comparison in future predictor research. Last, the experiments have educational value, providing better understanding of branch predictor mechanisms. Although the proposed approach is demonstrated for Intel P6 and NetBurst architectures, with minor modifications, it can also be used for other architectures.

Acknowledgments

The authors are grateful to the anonymous referees for their insights and suggestions for strengthening this paper.
This work has been partially supported by the SED of the AMCOM.

References

1. IA-32 Intel® Architecture Optimization – Reference Manual, Intel, http://www.intel.com/design/pentium4/manuals/248966.htm [July 2003].
2. Coleman CL, Davidson JW. Automatic memory hierarchy characterization. In Proceedings of the ISPASS 2001; 103–110.
3. Saavedra-Barrera R. CPU Performance Evaluation and Execution Time Prediction Using Narrow Spectrum Benchmarking. PhD Thesis, U.C. Berkeley, Computer Science Div., 1992.
4. Hennessy J, Patterson D. Computer Architecture: A Quantitative Approach. Morgan Kaufmann Publishers, San Mateo, CA, 2003.
5. Hinton G et al. The Microarchitecture of the Pentium® 4 Processor. Intel Technology Journal, (1st quarter 2001), http://www.intel.com/technology/itj/q12001.htm [July 2003].
6. Sprangle E, Carmean D. Increasing Processor Performance by Implementing Deeper Pipelines. In Proceedings of the 29th ISCA, 2002; 25–34.
7. Intel VTune™ Performance Analyzer, www.intel.com/software/products/vtune/ [August 2002].
8. London K et al. End-user Tools for Application Performance Analysis Using Hardware Counters. In Proceedings of the International Conference on Parallel and Distributed Computing Systems, 2001.
9. Smith JE. A study of Branch Prediction Strategies. In Proceedings of the 8th ISCA, 1981; 135–148.
10. Yeh TY, Patt YN. Two Level Adaptive Training Branch Prediction. In Proceedings of the Micro-24, 1991; 51–61.
11. Pan ST, So K, Rahmeh JT. Improving the Accuracy of Dynamic Branch Prediction Using Branch Correlation. In Proceedings of the ASPLOS V, 1992; 76–84.
12. Evers M, Chang PY, Patt YN. Using Hybrid Branch Prediction to Improve Branch Prediction Accuracy in the Presence of Context Switches. In Proceedings of the 23rd ISCA, 1996; 3–10.
13. Nair R. Dynamic Path-Based Branch Correlation. In Proceedings of the Micro-28, 1995; 15–23.
14. UltraSPARC User’s Manual, Sun Microelectronics, http://www.sun.com/processors/manuals/802-7220-02.pdf [July 2003].
15. Intel® Architecture Software Optimization Reference Manual, Intel, http://www.intel.com/design/PentiumIII/manuals/ [December 2001].
16. Milenkovic M, Milenkovic A, Kulick J. Demystifying Intel Branch Predictors. In Proceedings of the Workshop on Duplicating, Deconstructing, and Debunking, 2002; 52–61.