Abstract
In this paper, we present an update on scalable online support for performance data analysis and monitoring in TAU. Extending our prior work with TAUoverSupermon and TAUoverMRNet, we show how online analysis operations can also be supported directly and scalably using the parallel infrastructure provided by an MPI application instrumented with TAU. We also report on efforts to streamline and update TAUoverMRNet. Together, these approaches form the basis for investigating online analysis capabilities in TAUmon, a TAU monitoring framework. We discuss various analysis operations and capabilities enabled by online monitoring, and how operations such as event unification allow merged profiles to be produced with greatly reduced data volume prior to application shutdown. Scaling results with PFLOTRAN on the Cray XT5 and BG/P are presented, along with a look at some initial performance information generated from FLASH through our TAUmon prototype frameworks.
Keywords
- Parallel Performance
- Online Monitoring
- Analysis Operation
- Performance Data Analysis
- Instantiation Scheme
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lee, C.W., Malony, A.D., Morris, A. (2011). TAUmon: Scalable Online Performance Data Analysis in TAU. In: Guarracino, M.R., et al. Euro-Par 2010 Parallel Processing Workshops. Euro-Par 2010. Lecture Notes in Computer Science, vol 6586. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21878-1_61
DOI: https://doi.org/10.1007/978-3-642-21878-1_61
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21877-4
Online ISBN: 978-3-642-21878-1
eBook Packages: Computer Science, Computer Science (R0)