An Analytical Model for Loop Tiling Transformation | SpringerLink
Skip to main content

An Analytical Model for Loop Tiling Transformation

  • Conference paper
  • First Online:
Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS 2021)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13227))

Included in the following conference series:

Abstract

Loop tiling is a well-known loop transformation that enhances data locality in memory hierarchy. In this paper, we initially reveal two important inefficiencies of current analytical loop tiling models and we provide the theoretical background on how current analytical models can address these inefficiencies. To this end, we propose a new analytical model which is more accurate that the existing ones. We showcase, both theoretically and experimentally, that the proposed model can accurately estimate the number of cache misses for every generated tile size and as a result more efficient tile sizes are opted. Our evaluation results provide high cache misses gains and significant performance gains over gcc compiler and Pluto tool on an x86 platform.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
¥17,985 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
JPY 3498
Price includes VAT (Japan)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
JPY 9151
Price includes VAT (Japan)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
JPY 11439
Price includes VAT (Japan)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

References

  1. Baskaran, M.M., Hartono, A., Tavarageri, S., Henretty, T., Ramanujam, J., Sadayappan, P.: Parameterized tiling revisited. CGO 2010 (2010)

    Google Scholar 

  2. Bondhugula, U., Bandishti, V., Pananilath, I.: Diamond tiling: tiling techniques to maximize parallelism for stencil computations. In: IEEE TPDS, pp. 1285–1298 (2017)

    Google Scholar 

  3. Bondhugula, U., Hartono, A., Ramanujam, J., Sadayappan, P.: A practical automatic polyhedral parallelizer and locality optimizer. In: SIGPLAN, vol. 43, no. 6 (2008)

    Google Scholar 

  4. Chatterjee, S., Parker, E., Hanlon, P.J., Lebeck, A.R.: Exact analysis of the cache behavior of nested loops. SIGPLAN Not. 36(5), 286–297 (2001)

    Article  Google Scholar 

  5. Cohen, A., Zhao, J.: Flextended tiles: a flexible extension of overlapped tiles for polyhedral compilation. ACM TACO (2020)

    Google Scholar 

  6. Hammami, E., Slama, Y.: An overview on loop tiling techniques for code generation. In: 2017 IEEE/ACS AICCSA, pp. 280–287 (2017)

    Google Scholar 

  7. Hartono, A., et al.: Parametric multi-level tiling of imperfectly nested loops. In: ICS 2009, NY, USA, p. 147–157. New York (2009)

    Google Scholar 

  8. Hsu, C.H., Kremer, U.: A quantitative analysis of tile size selection algorithms. J. Supercomput. 27(3), 279–294 (2004)

    Article  Google Scholar 

  9. Kelefouras, V., Djemame, K.: A methodology correlating code optimizations with data memory accesses, execution time and energy consumption. J. Supercomput. 75(10), 6710–6745 (2019). https://doi.org/10.1007/s11227-019-02880-z

    Article  Google Scholar 

  10. Kelefouras, V.I., Athanasiou, G.S., Alachiotis, N., Michail, H.E., Kritikakou, A.S., Goutis, C.E.: A methodology for speeding up fast Fourier transform focusing on memory architecture utilization. IEEE Trans. Sig. Process. 59, 6217–6226 (2011)

    Article  MathSciNet  Google Scholar 

  11. Kelefouras, V., Georgios, K., Nikolaos, V.: Combining software cache partitioning and loop tiling for effective shared cache management. ACM Trans. Embed. Comput. Syst. 17(3), 72:1-72:25 (2018)

    Google Scholar 

  12. Kelefouras, V., Kritikakou, A., Mporas, I., Kolonias, V.: A high-performance matrix–matrix multiplication methodology for CPU and GPU architectures. J. Supercomput. 72(3), 804–844 (2016)

    Article  Google Scholar 

  13. Kelefouras, V., Kritikakou, A., Papadima, E., Goutis, C.: A methodology for speeding up matrix vector multiplication for single/multi-core architectures. J. Supercomput. 71(7), 2644–2667 (2015). https://doi.org/10.1007/s11227-015-1409-9

    Article  Google Scholar 

  14. Kelefouras, V., Kritikakou, A., Goutis, C.: A matrix–matrix multiplication methodology for single/multi-core architectures using SIMD. J. Supercomput. 68(3), 1418–1440 (2014). https://doi.org/10.1007/s11227-014-1098-9

    Article  Google Scholar 

  15. Li, R., et al.: Analytical cache modeling and tilesize optimization for tensor contractions. In: SC 2019 (2019)

    Google Scholar 

  16. Mehta, S., Beeraka, G., Yew, P.C.: Tile size selection revisited. ACM Trans. Archit. Code Optim. 10(4), 1–27 (2013)

    Article  Google Scholar 

  17. Nethercote, N., Walsh, R., Fitzhardinge, J.: Building workload characterization tools with valgrind. In: IISWC, p. 2. IEEE Computer Society (2006)

    Google Scholar 

  18. POUCHET, L.: Polybench/c. http://web.cse.ohio-state.edu/~pouchet.2/software/polybench/. Accessed 10 Oct 2020

  19. Renganarayanan, L., Kim, D., Strout, M.M., Rajopadhye, S.: Parameterized loop tiling. ACM Trans. Program. Lang. Syst. 34(1), 1–41 (2012)

    Article  Google Scholar 

  20. Sarkar, V., Megiddo, N.: An analytical model for loop tiling and its solution. In: IEEE ISPASS, pp. 146–153 (2000)

    Google Scholar 

  21. Sato, Y., Yuki, T., Endo, T.: An autotuning framework for scalable execution of tiled code via iterative polyhedral compilation. In: ACM TACO (2019)

    Google Scholar 

  22. Shirako, J., et al.: Analytical bounds for optimal tile size selection. In: CC 2012 (2012)

    Google Scholar 

  23. Stoltzfus, L., Hagedorn, B., Steuwer, M., Gorlatch, S., Dubach, C.: Tiling optimizations for stencil computations using rewrite rules in lift. ACM Trans. Archit. Code Optim. 16(4), 1–25 (2019)

    Article  Google Scholar 

  24. Tavarageri, S., Pouchet, L.N., Ramanujam, J., Rountev, A., Sadayappan, P.: Dynamic selection of tile sizes. In: HIPC 2011 (2011)

    Google Scholar 

  25. Whaley, R.C., Petitet, A., Dongarra, J.J.: Automated empirical optimization of software and the ATLAS project. Parallel Comput. 27(1–2), 3–35 (2001)

    Article  Google Scholar 

  26. Zhou, X., Giacalone, J.P., Garzarán, M.J., Kuhn, R.H., Ni, Y., Padua, D.: Hierarchical overlapped tiling. In: CGO 2012 (2012)

    Google Scholar 

Download references

Acknowledgements

This work has received funding from the European Union’s Horizon 2020 research and innovation programme under Grant Agreement No 957210 - XANDAR: X-by-Construction Design framework for Engineering Autonomous & Distributed Real-time Embedded Software Systems.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Vasilios Kelefouras .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Kelefouras, V., Djemame, K., Keramidas, G., Voros, N. (2022). An Analytical Model for Loop Tiling Transformation. In: Orailoglu, A., Jung, M., Reichenbach, M. (eds) Embedded Computer Systems: Architectures, Modeling, and Simulation. SAMOS 2021. Lecture Notes in Computer Science, vol 13227. Springer, Cham. https://doi.org/10.1007/978-3-031-04580-6_7

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-04580-6_7

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-04579-0

  • Online ISBN: 978-3-031-04580-6

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics