Abstract
Edge profiling is a very common means for providing feedback on program behavior that can be used statically by an optimizer to produce highly optimized binaries. However collecting full edge profile carries a significant runtime overhead. This overhead creates addition problems for real-time applications, as it may prevent the system from meeting runtime deadlines and thus alter its behavior. In this paper we show how a low overhead sampling technique can be used to collect inaccurate profile which is later used to approximate the full edge profile using a novel technique based on the Minimum Cost Circulation Problem. The outcome is a machine independent profile gathering scheme that creates a slowdown of only 2%-3% during the training set, and produces an optimized binary which is only 0.6% less than a fully optimized one.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Ball, T., Larus, J.R.: Optimally profiling and tracing programs. ACM Transactions on Programming Languages and Systems (July 1994)
Anderson, J., Bert, L.M., Dean, J., Ghemawat, S., Henzinger, M.R., Leung, S.-T., Sites, R.L., Vandevoorde, M.T., WaIdspurger, C.A., Weihl, W.E.: Continuous profiling: Where have all the cycles gone? In: Proceedings of the 16th Symposium on Operating Systems Principles (October 1997)
Reps, T., Ball, T., Das, M., Larus, J.: The use of program profiling for software maintenance with applications to the Year 2000 Problem. In: Jazayeri, M. (ed.) ESEC 1997 and ESEC-FSE 1997. LNCS, vol. 1301, pp. 432–449. Springer, Heidelberg (1997)
Wu, Y., Larus, J.R.: Static Branch Frequency and Program Profile Analysis. In: 27th IEEE/ACM InterÕl Symposium on Microarchitecture (MICRO-27) (November 1994)
Goldberg, V., Tarjan, R.E.: Finding minimum-cost circulations by canceling negative cycles. J. ACM 36, 873–886 (1989) Preliminary version appeared In: Proceedings of the 20th Annual ACM Symposium on Theory of Computing, pp. 388–397 (1987)
Haber, G., Henis, E.A., Eisenberg, V.: Reliable Post-link Optimizations Based on Partial Information. In: Proceedings of the 3rd Workshop on Feedback Directed and Dynamic Optimizations (December 2000)
Henis, E.A., Haber, G., Klausner, M., Warshavsky, A.: Feedback Based Post-link Optimization for Large Subsystems. In: Second Workshop on Feedback Directed Optimization, Haifa, Israel, pp. 13–20 (November 1999)
Nahshon, Bernstein, D.: FDPR - A Post-Pass Object Code Optimization Tool. In: Proc. Poster Session of the International Conference on Compiler Construction, pp. 97–104 (April 1996)
Romer, T., Voelker, G., Lee, D., Wolman, A., Wong, W., Levy, H., Bershad, B., Chen, B.: Instrumentation and Optimization of Win32/Intel Executables Using Etch. In: Proceedings of the USENIX Windows NT Workshop, pp. 1–7 (August 1997)
Schwarz, B., Debray, S., Andrews, G., Legendre, M.: PLTO: A link-Time Optimizer for the Intel IA-32 Architecture. In: Proceedings of Workshop on Binary Rewriting (September 2001)
Cohn, R., Goodwin, D., Lowney, P.G.: Optimizing Alpha Executables on Windows NT with Spike. Digital Technical Journal, Digital Equipment Corporation 9(4), 3–20 (1997)
Muth, R., Debray, S., Watterson, S.: alto: A Link-Time Optimizer for the Compaq Alpha, Technical Report 98-14, Dept. of Computer Science, The University of Arizona (December 1998)
Chang, P., et al.: Using Profile Information to Assist Classic Code Optimizations. Software-Practice and Experience 21(12), 1301–1321 (1991)
Ball, T., Larus, J.R.: Optimally Profiling and Tracing Programs. ACM Transactions on Programming Languages and Systems 16(4), 1319–1360 (1994)
Arnold, M., Ryder, B.: A framework for reducing the cost of instrumented code. In: SIGPLAN Conference on Programming Language Design and Implementation, pp. 168–179 (2001)
Arnold, M., Sweeney, P.F.: Approximating the calling context tree via sampling. IBM Research Report (July 2000)
Feller, P.T.: Value profiling for instructions and memory locations. Masters Thesis CS98-581, University of California San Diego (April 1998)
Zhuang, X., Serrano, M.J., Cain, H.W.: Accurate, Efficient, and Adaptive Calling Context Profiling. In: PLDI 2006 (2006)
Knuth, D.E., Stevenson, F.R.: Optimal measurement points for program frequency counts. BIT 13, 313–322 (1973)
SPEC CPU2000, http://www.spec.org/cpu2000
Spezialetti, M., Gupta, R.: Timed Perturbation Analysis: An Approach for Non-Intrusive Monitoring of Real-Time Computations. In: ACM SIGPLAN Workshop on Language, Compiler, and Tool Support for Real-Time Systems, Orlando, Florida (June 1994)
Cell SPE Oprofile patch, http://patchwork.ozlabs.org/linuxppc/patch?id=9627
Cell alphaworks SDK, http://www.alphaworks.ibm.com/topics/cell
Author information
Authors and Affiliations
Editor information
Rights and permissions
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Levin, R., Newman, I., Haber, G. (2008). Complementing Missing and Inaccurate Profiling Using a Minimum Cost Circulation Algorithm. In: Stenström, P., Dubois, M., Katevenis, M., Gupta, R., Ungerer, T. (eds) High Performance Embedded Architectures and Compilers. HiPEAC 2008. Lecture Notes in Computer Science, vol 4917. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77560-7_20
Download citation
DOI: https://doi.org/10.1007/978-3-540-77560-7_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77559-1
Online ISBN: 978-3-540-77560-7
eBook Packages: Computer ScienceComputer Science (R0)