Locality Optimization of Stencil Applications Using Data Dependency Graphs

  • Conference paper
Languages and Compilers for Parallel Computing (LCPC 2010)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6548)

Abstract

This paper proposes tiling techniques based on data dependencies rather than on code structure.

The work presented here builds on and extends the authors' previous work on non-traditional tiling for parallel applications.

The main contributions of this paper are: (1) a formal description of tiling from the point of view of the data produced rather than the source code; (2) a mathematical proof of an optimal tiling, in terms of maximum reuse, for stencil applications, which addresses the disparity between computation power and memory bandwidth in many-core architectures; (3) a description and implementation of our tiling technique for well-known stencil applications; and (4) experimental evidence that the proposed tiling alleviates the disparity between computation power and memory bandwidth in many-core architectures. Our experiments, performed on one of the first Cyclops-64 many-core chips produced, confirm that our approach reduces both the total number of memory operations of stencil applications and their running time.
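
To illustrate why tiling a stencil across time steps can reduce memory operations, the following is a minimal sketch and not the tiling proposed in the paper. It assumes a 1D three-point averaging stencil with fixed boundary cells and uses simple overlapped tiles with redundant halo computation. The function names (`step`, `naive`, `overlapped_tiles`) and the cost model, which counts one load per cell brought in from slow memory, are assumptions made for illustration only.

```python
def step(u):
    """One sweep of a 3-point averaging stencil; the two end cells are held fixed."""
    interior = [(u[i - 1] + u[i] + u[i + 1]) / 3.0 for i in range(1, len(u) - 1)]
    return [u[0]] + interior + [u[-1]]


def naive(u, steps):
    """Sweep the whole array once per time step (assumed cost: one load per cell per step)."""
    loads = 0
    for _ in range(steps):
        loads += len(u)
        u = step(u)
    return u, loads


def overlapped_tiles(u, steps, width):
    """Advance every tile by `steps` time steps after loading it, plus a halo, once.

    The halo is `steps` cells wide on each side. Cells near the edge of the
    local buffer become stale as the sweeps proceed, but the staleness moves
    inward by only one cell per step, so the central `width` cells are exact.
    """
    n = len(u)
    out = list(u)
    loads = 0
    for start in range(0, n, width):
        end = min(start + width, n)
        lo = max(start - steps, 0)   # halo on the left, clamped at the boundary
        hi = min(end + steps, n)     # halo on the right, clamped at the boundary
        local = u[lo:hi]             # single load of tile plus halo
        loads += hi - lo
        for _ in range(steps):
            local = step(local)
        out[start:end] = local[start - lo:end - lo]
    return out, loads


if __name__ == "__main__":
    u0 = [float((i * 37) % 11) for i in range(1024)]
    ref, naive_loads = naive(u0, 8)
    got, tiled_loads = overlapped_tiles(u0, 8, 128)
    print("results match:", ref == got)
    print("loads (naive, tiled):", naive_loads, tiled_loads)
```

Under this toy model the tiled version loads each cell roughly once per group of time steps instead of once per step, at the price of recomputing halo cells. The paper's approach, which derives tiles from the data dependency graph, targets the same kind of reuse; this sketch makes no claim about its optimality or its actual tile shapes.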

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Orozco, D., Garcia, E., Gao, G. (2011). Locality Optimization of Stencil Applications Using Data Dependency Graphs. In: Cooper, K., Mellor-Crummey, J., Sarkar, V. (eds) Languages and Compilers for Parallel Computing. LCPC 2010. Lecture Notes in Computer Science, vol 6548. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19595-2_6

  • DOI: https://doi.org/10.1007/978-3-642-19595-2_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-19594-5

  • Online ISBN: 978-3-642-19595-2

  • eBook Packages: Computer Science, Computer Science (R0)
