Abstract
High performance and efficiency in parallel computing are important for large-scale discrete element method (DEM) simulation. After analyzing a DEM simulation framework built on a graphics processing unit (GPU) platform with the CUDA architecture and evaluating the simulated data, we propose three optimization methods to improve the performance of the system. First, a stencil computation model is applied to the grid-based particle search and force calculation, giving a regular structure to neighboring-particle searching and particle-particle contact evaluation. Second, an effective parallel granularity is sought by varying the number of blocks and threads on the GPU. Third, shared memory is used for data prefetching and for storing the results of intermediate calculations, with its use determined by analysis and calculation. The experimental results show that the stencil model is useful for particle searching and force calculation, and that a suitable parallel granularity together with appropriate use of shared memory improves the performance of the DEM simulation framework.
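To make the grid-based stencil idea concrete, the following is a minimal sketch and not the framework's implementation. It assumes 2D particles of equal radius, runs on the CPU in Python rather than in CUDA, and uses hypothetical names (`build_grid`, `find_contacts`, `cell_size`). Particles are binned into uniform cells, and each cell then checks only the fixed 3x3 stencil of cells around it for contacts.

```python
from collections import defaultdict
from itertools import product


def build_grid(positions, cell_size):
    """Bin particle indices by integer cell coordinates (uniform grid)."""
    grid = defaultdict(list)
    for i, (x, y) in enumerate(positions):
        grid[(int(x // cell_size), int(y // cell_size))].append(i)
    return grid


def find_contacts(positions, radius, cell_size):
    """Return sorted (i, j) pairs, i < j, of overlapping equal-radius discs.

    Assumption: cell_size >= 2 * radius, so any contacting pair lies in
    the same cell or in directly adjacent cells.
    """
    assert cell_size >= 2 * radius
    grid = build_grid(positions, cell_size)
    # Fixed access pattern: the cell itself plus its 8 neighbours.
    stencil = list(product((-1, 0, 1), repeat=2))
    contact_dist_sq = (2 * radius) ** 2
    contacts = set()
    for (cx, cy), members in grid.items():
        for dx, dy in stencil:
            # .get avoids creating empty cells while iterating the grid.
            for j in grid.get((cx + dx, cy + dy), ()):
                for i in members:
                    if i < j:
                        xi, yi = positions[i]
                        xj, yj = positions[j]
                        if (xi - xj) ** 2 + (yi - yj) ** 2 < contact_dist_sq:
                            contacts.add((i, j))
    return sorted(contacts)


if __name__ == "__main__":
    pts = [(0.5, 0.5), (1.4, 0.5), (5.0, 5.0), (0.95, 0.5)]
    print(find_contacts(pts, radius=0.5, cell_size=1.0))
```

Because every cell reads the same fixed pattern of neighbouring cells, the memory accesses are regular. In a GPU version, one would plausibly map particles or cells to threads and stage the stencil's cell data in shared memory, but that mapping is an assumption here and not a detail given in the abstract.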
This work was supported by the National Natural Science Foundation of China (No. 11372067).
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Xue, R., Wang, Y., Guo, H., Zhang, C., Ji, S. (2016). Performance Optimization of a DEM Simulation Framework on GPU Using a Stencil Model. In: Xie, J., Chen, Z., Douglas, C., Zhang, W., Chen, Y. (eds) High Performance Computing and Applications. HPCA 2015. Lecture Notes in Computer Science, vol 9576. Springer, Cham. https://doi.org/10.1007/978-3-319-32557-6_11
DOI: https://doi.org/10.1007/978-3-319-32557-6_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-32556-9
Online ISBN: 978-3-319-32557-6
eBook Packages: Computer Science, Computer Science (R0)