Abstract
Computational requirements in scientific fields continue to grow, and the advent of clustering systems provides an affordable alternative to expensive conventional supercomputers. However, parallel programming remains difficult for non-computer scientists. We developed the Directive-Based MPI Code Generator (DMCG), which transforms sequential C programs into parallel message-passing form. We also introduce a loop scheduling method for load balancing that relies on a message-passing analyzer and is simple and straightforward to use. This approach offers a view of loop parallelism entirely different from that in the literature, which relies on dependence abstractions. Experimental results show that our approach achieves efficient outcomes and that DMCG can serve as a general-purpose tool to help parallel programming beginners construct programs quickly and to port existing sequential programs to PC clusters.
Cite this article
Yang, CT., Lai, KC. A directive-based MPI code generator for Linux PC clusters. J Supercomput 50, 177–207 (2009). https://doi.org/10.1007/s11227-008-0258-1