Abstract
This paper introduces Numprof, a profiling framework for performance analysis of numerical libraries. The framework consists of a profiler and a replayer for the BLAS and FFTW3 libraries. The profiler records library call events with a user-configurable level of detail. The replayer executes library calls based on the trace files generated by the profiler. We explore real-world use cases for the framework and demonstrate that its low overhead makes it feasible to use for continuous statistical analysis of numerical library calls.
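The record-and-replay idea described above can be illustrated with a minimal sketch. This is not Numprof's actual interface or trace format, neither of which is given in the abstract. The names `profiled`, `replay`, the `detail` levels, and the JSON event layout are all hypothetical, and a pure-Python `ddot` stands in for a real BLAS routine.

```python
import json
import time


def profiled(trace, name, detail=1):
    """Wrap a kernel so each call appends an event to `trace` (hypothetical API)."""
    def decorator(func):
        def wrapper(n, x, y):
            start = time.perf_counter()
            result = func(n, x, y)
            # Detail level 1 records only the call name and problem size.
            event = {"call": name, "n": n}
            # Detail level 2 (an assumed convention) also records elapsed time.
            if detail >= 2:
                event["seconds"] = time.perf_counter() - start
            trace.append(event)
            return result
        return wrapper
    return decorator


def ddot(n, x, y):
    """Pure-Python stand-in for a BLAS dot product; not a real BLAS binding."""
    return sum(x[i] * y[i] for i in range(n))


def replay(trace_json, kernels):
    """Re-execute every recorded call with synthetic operands of the recorded size."""
    timings = []
    for event in json.loads(trace_json):
        n = event["n"]
        x = [1.0] * n
        y = [2.0] * n
        start = time.perf_counter()
        kernels[event["call"]](n, x, y)
        timings.append((event["call"], n, time.perf_counter() - start))
    return timings


# Profile one call, serialize the trace, then replay it without the application.
trace = []
traced_ddot = profiled(trace, "ddot", detail=2)(ddot)
dot_result = traced_ddot(3, [1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
trace_json = json.dumps(trace)
replayed = replay(trace_json, {"ddot": ddot})
```

Because the trace stores only call names and sizes, the replay step can rerun the same workload against a different kernel implementation by passing a different `kernels` mapping.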
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lehto, OP. (2013). Numprof: A Performance Analysis Framework for Numerical Libraries. In: Manninen, P., Öster, P. (eds) Applied Parallel and Scientific Computing. PARA 2012. Lecture Notes in Computer Science, vol 7782. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36803-5_22
DOI: https://doi.org/10.1007/978-3-642-36803-5_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36802-8
Online ISBN: 978-3-642-36803-5
eBook Packages: Computer Science (R0)