Abstract
The lack of fault tolerance is becoming a limiting factor for application scalability in HPC systems. The MPI standard does not provide fault tolerance interfaces and semantics. The MPI Forum's Fault Tolerance Working Group is proposing a collective fault tolerant agreement algorithm for the next MPI standard. Such algorithms play a central role in many fault tolerant applications. This paper combines a log-scaling two-phase commit agreement algorithm with a reduction operation to provide the functionality required by the new collective without any additional messages. It also describes error handling mechanisms that preserve the fault tolerance properties while maintaining overall scalability.
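The idea of folding a reduction into a tree-based two-phase commit can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the paper's algorithm: the function name `tree_agreement`, the binary-heap tree layout, and the choice of bitwise AND as the reduction are all hypothetical, the set of failed ranks is assumed to be known up front, and no process fails during the run, so the error handling mentioned in the abstract is not modelled.

```python
from operator import and_


def tree_agreement(local_values, failed=frozenset(), op=and_):
    """Simulate one failure-free agreement round over a binary tree.

    Assumptions (not from the paper): ranks listed in `failed` are
    excluded before the round starts, and the tree is a binary heap
    laid over the surviving ranks.
    Returns (decision per alive rank, messages sent, communication rounds).
    """
    alive = [r for r in range(len(local_values)) if r not in failed]
    n = len(alive)
    if n == 0:
        raise ValueError("no alive processes")

    partial = {}   # tree position -> value reduced over its subtree
    messages = 0

    # Phase 1 (vote): children are visited before parents, and each
    # child-to-parent vote message carries the partial reduction, so the
    # reduction needs no messages of its own.
    for pos in range(n - 1, -1, -1):
        value = local_values[alive[pos]]
        for child in (2 * pos + 1, 2 * pos + 2):
            if child < n:
                value = op(value, partial[child])
                messages += 1
        partial[pos] = value

    decision = partial[0]

    # Phase 2 (commit): the root's decision travels back down the tree,
    # one message per non-root process.
    result = {}
    for pos in range(n):
        result[alive[pos]] = decision
        if pos > 0:
            messages += 1

    depth = n.bit_length() - 1   # floor(log2(n)) levels below the root
    return result, messages, 2 * depth


if __name__ == "__main__":
    values = [0b111, 0b101, 0b110, 0b111]
    decisions, msgs, rounds = tree_agreement(values, failed={2})
    print(decisions, msgs, rounds)
```

In this model the round costs 2(n - 1) messages over 2·floor(log2 n) communication steps for n surviving processes, which is the sense in which a tree-structured agreement scales logarithmically.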
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hursey, J., Naughton, T., Vallee, G., Graham, R.L. (2011). A Log-Scaling Fault Tolerant Agreement Algorithm for a Fault Tolerant MPI. In: Cotronis, Y., Danalis, A., Nikolopoulos, D.S., Dongarra, J. (eds) Recent Advances in the Message Passing Interface. EuroMPI 2011. Lecture Notes in Computer Science, vol 6960. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24449-0_29
DOI: https://doi.org/10.1007/978-3-642-24449-0_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24448-3
Online ISBN: 978-3-642-24449-0
eBook Packages: Computer Science, Computer Science (R0)