Abstract
In [8], a high availability framework based on Harary graph as network topology has been proposed for stateful failover. Framework proposed therein exhibits an interesting property that an uniform load can be given to each non-faulty node while maintaining fault tolerance. A challenging problem in this context, which has not been addressed in [8] is to be able to come up with a distributed algorithm of automated fault recovery which can exploit the properties exhibited by the framework. In this work, we propose a distributed algorithm with low message and round complexity for automated fault recovery in case of stateful failover. We then prove the correctness of the algorithm using techniques from formal verification. The safety, liveness and the timeliness properties of the algorithm have been verified by the model checker SPIN.
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Harary, F.: The Maximum Connectivity of a Graph. Proc. Nat. Acad. Sci. U.S.A. 48, 1142–1146 (1962)
Kuhl, J.G., Reddy, S.M.: Distributed Fault Tolerance for Large Multiprocessor Systems. Computer Architecture News (7th Intl. Symposium on Computer Architectures) 8, 23–30 (1980)
Yang, C.L., Massona, G.M.: Distributed Algorithm for Fault Diagnosis in Systems with Soft Failures. IEEE Transaction on Computers 37(11) (1988)
Sridhar, M.A., Raghavendra, C.S.: Fault-Tolerant Network Based on De Bruijn Graph. IEEE Transaction On Computers 40(10) (1991)
Mukhopadhyay, K., Sinha, B.P.: Hamiltonian Graphs with Minimum Number of Edges For Fault-Tolerant Topologies. Information Processing Letters 44, 95–99 (1992)
Sung, T.Y., Ho, T.Y., Hsu, L.H.: Optimal k-Fault-Tolerant Networks for Token Rings. Journal of Information Science and Engineering 16, 381–390 (2000)
Hung, C., Hsu, L., Sung, T.: On the construction of combined k-fault-tolerant Hamiltonian graphs. Networks 37(3), 165–170 (2001)
Saha, I., Mukhopadhyay, D., Banerjee, S.: Designing Reliable Architecture for Stateful Fault Tolerance. In: Proceedings of Seventh International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT’06), pp. 545–551 (2006)
Holtzman, G.J.: The SPIN Model Checker, Primer and Reference Manual. Addison-Wesley, Reading (2003)
Author information
Authors and Affiliations
Editor information
Rights and permissions
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Saha, I., Mukhopadhyay, D. (2007). A Distributed Algorithm of Fault Recovery for Stateful Failover. In: Cai, JY., Cooper, S.B., Zhu, H. (eds) Theory and Applications of Models of Computation. TAMC 2007. Lecture Notes in Computer Science, vol 4484. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72504-6_67
Download citation
DOI: https://doi.org/10.1007/978-3-540-72504-6_67
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72503-9
Online ISBN: 978-3-540-72504-6
eBook Packages: Computer ScienceComputer Science (R0)