Abstract
When executing Fortran 90 style data-parallel array operations on distributed-memory multiprocessors, intraprocessor data movement due to shift operations can account for a significant fraction of the execution time. This paper describes a strategy for minimizing data movement caused by Fortran 90 CSHIFT operations and presents a compiler technique that exploits this strategy automatically. The compiler technique is global in scope and can reduce data movement even when a definition of an array and its uses are separated by control flow. This technique supersedes those whose scope is restricted to a single statement. We focus on the application of this strategy on distributed-memory architectures, although it is more broadly applicable.
This research was supported in part by the NSF Cooperative Research Agreement Number CCR-9120008.
Supported in part by the IBM Corporation through the Graduate Resident Study Program.
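As a rough illustration of the data movement the abstract refers to, the sketch below models a rank-1 array with a BLOCK distribution and counts how many elements a circular shift would copy within a processor versus across a processor boundary, assuming a naive implementation that materializes the whole shifted array. The function names (`cshift`, `count_movement`), the divisibility assumption, and the counting model are hypothetical and chosen only for illustration; this is not the compiler technique the paper presents.

```python
def cshift(a, shift):
    """Circular shift of a rank-1 list with Fortran 90 CSHIFT semantics:
    result[i] = a[(i + shift) mod n], so a positive shift moves
    elements toward lower indices."""
    n = len(a)
    return [a[(i + shift) % n] for i in range(n)]


def count_movement(n, procs, shift):
    """Count element copies for a naive CSHIFT on a BLOCK-distributed array.

    Assumption (illustrative only): n is divisible by procs, and every
    element of the result is copied from its source location.
    Returns (interprocessor, intraprocessor) element counts.
    """
    if n % procs != 0:
        raise ValueError("this sketch assumes n is divisible by procs")
    block = n // procs
    inter = 0
    intra = 0
    for i in range(n):
        src = (i + shift) % n          # index the result element comes from
        if src // block != i // block:  # owners differ: crosses a boundary
            inter += 1
        else:                           # same owner: local memory copy
            intra += 1
    return inter, intra


if __name__ == "__main__":
    print(cshift([1, 2, 3, 4, 5], 1))
    print(count_movement(n=16, procs=4, shift=1))
```

Under these assumptions, a shift by one on 16 elements spread over 4 processors moves only one element per processor across a boundary, while the remaining elements are copied locally. This local copying is the kind of intraprocessor movement the abstract identifies as costly.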
Copyright information
© 1996 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kennedy, K., Mellor-Crummey, J., Roth, G. (1996). Optimizing Fortran 90 shift operations on distributed-memory multicomputers. In: Huang, CH., Sadayappan, P., Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1995. Lecture Notes in Computer Science, vol 1033. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0014198
DOI: https://doi.org/10.1007/BFb0014198
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60765-6
Online ISBN: 978-3-540-49446-1
eBook Packages: Springer Book Archive