Computer Science, 2015, Vol. 42, Issue (12): 18-22.
XU Jin-long, ZHAO Rong-cai and XU Xiao-yan (徐金龙, 赵荣彩, 徐晓燕)
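Abstract: Vector programs are either written by hand or generated automatically by a compiler. Because programmers and parallelizing compilers both have limits, the resulting vector programs usually leave room for optimization. Optimizing compilers typically focus on how to vectorize serial programs and rarely optimize programs that are already vectorized. This paper therefore proposes a vector memory-access optimization method for SIMD code. The method first analyzes whether a program needs optimization; if it does, the method applies deep redundancy optimization and alignment optimization to the program together. Experimental data show that the proposed method markedly improves the running efficiency of programs, which meets the stated goal.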
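The two optimizations named in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it is a hand-written, portable C model under stated assumptions. The type `vec4`, the helpers `vload`, `vstore`, `vadd` and `vshift`, and the stencil `out[i] = a[i] + a[i+1]` are all hypothetical and chosen only for illustration. The base pointer is assumed to be vector-aligned, so alignment is judged by element index. The naive version issues one aligned and one overlapping unaligned vector load per iteration; the optimized version reuses the vector loaded in the previous iteration (redundancy removal) and builds the shifted operand from two aligned vectors (alignment optimization).

```c
#include <assert.h>
#include <stddef.h>

#define VL 4  /* assumed vector length in floats, e.g. one 128-bit register */

typedef struct { float lane[VL]; } vec4;               /* hypothetical stand-in for a SIMD register */
typedef struct { long aligned, unaligned; } load_stats; /* counts modeled vector loads */

/* Modeled vector load. The base pointer is assumed vector-aligned,
   so the load is "aligned" exactly when idx is a multiple of VL. */
static vec4 vload(const float *base, size_t idx, load_stats *st) {
    vec4 v;
    for (int k = 0; k < VL; k++) v.lane[k] = base[idx + k];
    if (idx % VL == 0) st->aligned++; else st->unaligned++;
    return v;
}

static void vstore(float *base, size_t idx, vec4 v) {
    for (int k = 0; k < VL; k++) base[idx + k] = v.lane[k];
}

static vec4 vadd(vec4 x, vec4 y) {
    vec4 r;
    for (int k = 0; k < VL; k++) r.lane[k] = x.lane[k] + y.lane[k];
    return r;
}

/* Take VL lanes starting at offset `shift` from the concatenation lo:hi,
   similar in spirit to an align/shuffle instruction. */
static vec4 vshift(vec4 lo, vec4 hi, int shift) {
    vec4 r;
    for (int k = 0; k < VL; k++) {
        int s = k + shift;
        r.lane[k] = (s < VL) ? lo.lane[s] : hi.lane[s - VL];
    }
    return r;
}

/* out[i] = a[i] + a[i+1] for i in [0, n).
   Requires n % VL == 0 and that a holds at least n + VL floats. */
load_stats stencil_naive(const float *a, float *out, size_t n) {
    load_stats st = {0, 0};
    assert(n % VL == 0);
    for (size_t i = 0; i < n; i += VL) {
        vec4 v0 = vload(a, i, &st);      /* aligned */
        vec4 v1 = vload(a, i + 1, &st);  /* unaligned, overlaps v0 */
        vstore(out, i, vadd(v0, v1));
    }
    return st;
}

/* Same result, but each aligned vector is loaded once and reused,
   and the shifted operand is assembled from aligned vectors. */
load_stats stencil_optimized(const float *a, float *out, size_t n) {
    load_stats st = {0, 0};
    assert(n % VL == 0);
    if (n == 0) return st;
    vec4 cur = vload(a, 0, &st);
    for (size_t i = 0; i < n; i += VL) {
        vec4 next = vload(a, i + VL, &st); /* aligned; becomes cur next iteration */
        vec4 v1 = vshift(cur, next, 1);    /* replaces the unaligned load */
        vstore(out, i, vadd(cur, v1));
        cur = next;
    }
    return st;
}
```

For `n = 16` and an input of 20 floats, both functions write the same output. In this model the naive version performs 4 aligned and 4 unaligned loads, while the optimized version performs 5 aligned loads and none unaligned. Whether that trade pays off on real hardware depends on the cost of unaligned loads and shuffles on the target, which this model does not measure.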