Abstract
In the semi-Lagrangian interpolation scheme of the Yin-he Global Spectral Model (YHGSM), communication must complete before interpolation can begin, which incurs significant communication overhead. To address this problem, we propose an optimized scheme that overlaps computation with communication based on grouping levels. The scheme divides the vertical levels into three groups and overlaps the computation of one group with the communication of another. Experimental results show that our scheme reduces the running time of the semi-Lagrangian scheme by 12.5% and effectively reduces the communication overhead of the model, improving the efficiency of YHGSM.
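The grouping idea can be illustrated with a minimal Python sketch. It is not the YHGSM implementation: the function names `split_levels`, `exchange_halo`, `interpolate`, and `run_overlapped` are hypothetical, the communication and interpolation bodies are placeholders, and a background thread stands in for whatever non-blocking communication the model actually uses. The sketch only shows the ordering: while one group of levels is being interpolated, the data for the next group is already being fetched.

```python
from concurrent.futures import ThreadPoolExecutor


def split_levels(n_levels, n_groups=3):
    """Split level indices 0..n_levels-1 into contiguous, near-equal groups."""
    base, extra = divmod(n_levels, n_groups)
    groups, start = [], 0
    for g in range(n_groups):
        size = base + (1 if g < extra else 0)  # first `extra` groups get one more level
        groups.append(list(range(start, start + size)))
        start += size
    return groups


def exchange_halo(levels):
    """Hypothetical stand-in for the communication step (assumption)."""
    return {lev: float(lev) for lev in levels}


def interpolate(halo):
    """Hypothetical stand-in for the interpolation step (assumption)."""
    return {lev: 2.0 * value for lev, value in halo.items()}


def run_overlapped(n_levels, n_groups=3):
    """Interpolate group g while the data for group g+1 is being fetched."""
    groups = split_levels(n_levels, n_groups)
    results = {}
    with ThreadPoolExecutor(max_workers=1) as comm:
        pending = comm.submit(exchange_halo, groups[0])
        for g in range(len(groups)):
            halo = pending.result()  # wait until group g's data has arrived
            if g + 1 < len(groups):
                # Start fetching the next group before computing on this one.
                pending = comm.submit(exchange_halo, groups[g + 1])
            results.update(interpolate(halo))
    return results
```

With three groups, only the communication for the first group is fully exposed in this ordering; the other two transfers can be hidden behind computation to the extent that each group's computation takes at least as long as the next group's transfer.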
Data availability
The data are available from the corresponding author on reasonable request.
Acknowledgements
This research is funded by the National Natural Science Foundation of China (41875121).
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
About this article
Cite this article
Liu, D., Liu, W., Pan, L. et al. Optimization of the parallel semi-Lagrangian scheme to overlap computation with communication based on grouping levels in YHGSM. CCF Trans. HPC 6, 68–77 (2024). https://doi.org/10.1007/s42514-023-00163-x
DOI: https://doi.org/10.1007/s42514-023-00163-x