Abstract
Computational requirements in scientific fields continue to grow, and the advent of clustering systems provides an affordable alternative to expensive conventional supercomputers. However, parallel programming remains difficult for non-computer scientists. We developed the Directive-Based MPI Code Generator (DMCG), which transforms sequential C programs into parallel message-passing form. We also introduce a loop scheduling method for load balancing that relies on a message-passing analyzer and is simple and straightforward to use. This approach offers a view of loop parallelism entirely different from that in the literature, which relies on dependence abstractions. Experimental results show that our approach achieves efficient outcomes and that DMCG can serve as a general-purpose tool to help parallel programming beginners construct programs quickly and to port existing sequential programs to PC clusters.
Cite this article
Yang, CT., Lai, KC. A directive-based MPI code generator for Linux PC clusters. J Supercomput 50, 177–207 (2009). https://doi.org/10.1007/s11227-008-0258-1