Abstract
This paper introduces two tools for manual energy evaluation and runtime tuning, developed at IT4Innovations as part of the READEX project. The MERIC library supports manual instrumentation of any application and analysis of its energy consumption and runtime. Besides tracing, MERIC can also change environment and hardware parameters while the application is running, which can lead to energy savings.
MERIC stores large amounts of data that are difficult for a human to read. The RADAR generator analyses the MERIC output files to find the best settings of the evaluated parameters for each instrumented region. It generates a report and a MERIC configuration file for production runs of the application.
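The per-region selection described above can be illustrated with a minimal sketch. It does not reproduce the RADAR generator or the MERIC file format: the measurement tuples, the choice of core and uncore frequency as tuned parameters, the function name `best_settings` and the optional `max_slowdown` limit are all assumptions made for illustration, and the numbers are invented.

```python
from collections import defaultdict

# Hypothetical measurements, one tuple per (region, setting) run:
# (region, core_freq_GHz, uncore_freq_GHz, energy_J, time_s).
# The values are invented for illustration; they are not MERIC output.
measurements = [
    ("assemble", 2.5, 3.0, 120.0, 1.00),
    ("assemble", 1.9, 3.0, 105.0, 1.08),
    ("assemble", 1.9, 2.0, 98.0, 1.25),
    ("solve", 2.5, 3.0, 300.0, 2.00),
    ("solve", 2.5, 2.0, 310.0, 2.60),
    ("solve", 1.9, 3.0, 280.0, 2.10),
]


def best_settings(rows, max_slowdown=None):
    """Return the lowest-energy setting for each region.

    If max_slowdown is given (e.g. 0.10 for 10 %), settings whose runtime
    exceeds the fastest run of that region by more than this fraction
    are ignored. This limit is an assumption of the sketch.
    """
    by_region = defaultdict(list)
    for region, core, uncore, energy, time in rows:
        by_region[region].append((core, uncore, energy, time))

    best = {}
    for region, candidates in by_region.items():
        if max_slowdown is not None:
            fastest = min(c[3] for c in candidates)
            limit = fastest * (1.0 + max_slowdown)
            candidates = [c for c in candidates if c[3] <= limit]
        # Pick the candidate with the smallest measured energy.
        core, uncore, energy, time = min(candidates, key=lambda c: c[2])
        best[region] = {
            "core_freq_GHz": core,
            "uncore_freq_GHz": uncore,
            "energy_J": energy,
            "time_s": time,
        }
    return best


if __name__ == "__main__":
    for region, setting in best_settings(measurements, max_slowdown=0.10).items():
        print(region, setting)
```

Choosing by energy alone may select a setting that slows a region down noticeably; the optional runtime limit shows one way such a trade-off could be bounded.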
Notes
1. Uncore frequency refers to the frequency of subsystems in the physical processor package that are shared by multiple processor cores, e.g., the L3 cache and the on-chip ring interconnect.
2. MERIC repository: https://code.it4i.cz/vys0053/meric.
3. The Intel Haswell processors do not support floating-point instruction counters. MERIC approximates FLOPs/s based on the counter of Advanced Vector Extensions (AVX) calculation operations. For more information, visit https://github.com/RRZE-HPC/likwid/wiki/FlopsHaswell.
4. htop repository: https://github.com/hishamhm/htop.
5. RADAR generator repository: https://code.it4i.cz/bes0030/readex-radar.
6. ESPRESO library website: http://espreso.it4i.cz/.
Acknowledgement
This work was supported by The Ministry of Education, Youth and Sports from the National Programme of Sustainability (NPU II) project “IT4Innovations excellence in science - LQ1602” and by the IT4Innovations infrastructure which is supported from the Large Infrastructures for Research, Experimental Development and Innovations project “IT4Innovations National Supercomputing Center – LM2015070”.
The research leading to these results has received funding from the European Union’s Horizon 2020 Programme under grant agreement number 671657.
The work was additionally supported by VŠB – Technical University of Ostrava under the grant SP2017/165 and by the Barcelona Supercomputing Center under the grants 288777, 610402 and 671697.
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this paper
Vysocky, O., Beseda, M., Říha, L., Zapletal, J., Lysaght, M., Kannan, V. (2018). MERIC and RADAR Generator: Tools for Energy Evaluation and Runtime Tuning of HPC Applications. In: Kozubek, T., et al. High Performance Computing in Science and Engineering. HPCSE 2017. Lecture Notes in Computer Science, vol 11087. Springer, Cham. https://doi.org/10.1007/978-3-319-97136-0_11
DOI: https://doi.org/10.1007/978-3-319-97136-0_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-97135-3
Online ISBN: 978-3-319-97136-0
eBook Packages: Computer Science (R0)