Abstract
An efficient distributed fault‐tolerant routing algorithm for the hypercube is proposed based on the existence of a complete set of node‐disjoint paths between any two nodes. Node failures and repairs may occur dynamically provided that the total number of faulty nodes at any time is less than the node‐connectivity n of the n‐cube. Each node records, for each possible destination, which of the associated node‐disjoint paths to use. When a message is blocked by a node failure, the source node is warned and requested to switch to a different node‐disjoint path. The methods used to identify the paths, to propagate node failure information to source nodes, and to switch from one routing path to another incur little communication and computation overhead. We show that if the faults occur reasonably far apart in time, then all messages will be routed on optimal or near‐optimal paths. In the unlikely case where many faults occur in a short period, the algorithm still delivers all messages, but possibly via longer paths. An extension of the algorithm to handle link failures in addition to node failures is discussed. We also show how to adapt the algorithm to k‐ary n‐cube networks. The algorithm can be similarly adapted to any interconnection network for which there exists a simple characterization of node‐disjoint paths between its nodes.
Cite this article
Day, K., Harous, S. & Al‐Ayyoub, A. A fault tolerant routing scheme for hypercubes. Telecommunication Systems 13, 29–44 (2000). https://doi.org/10.1023/A:1019171418147
DOI: https://doi.org/10.1023/A:1019171418147