Abstract
The ocean general circulation model (OGCM) is an essential tool for research in oceanography and atmospheric science. The LASG/IAP climate system ocean model version 3 (LICOM3) is a parallel OGCM. Our goal is to implement and optimize a GPU version of LICOM3 based on the compute unified device architecture (CUDA), called LICOM3-CUDA. Considering the characteristics of LICOM3 and CUDA, we design and implement several pivotal optimization methods, including redesigning the numerical algorithms of complicated functions, decoupling data dependencies, avoiding memory write conflicts, and optimizing communication. We select two experiments, at 1\(^{\circ }\) (small-scale) and 0.1\(^{\circ }\) (large-scale) resolution, to evaluate the performance of LICOM3-CUDA. On a system with two Intel Xeon Gold 6148 CPUs and four NVIDIA Quadro GV100 GPUs, LICOM3-CUDA (1\(^{\circ }\)) achieves a simulation speed of 114.3 simulated years per day (SYPD). Compared with LICOM3, LICOM3-CUDA runs 6.5 times faster, and the compute-intensive module achieves a speedup of over 70\(\times\). In addition, the energy consumption per simulation year is reduced by 41.3%. For the high-resolution, large-scale simulation, increasing the number of GPUs from 96 to 1536 reduces the LICOM3-CUDA (0.1\(^{\circ }\)) run time from 3261 to 720 seconds, a speedup of approximately 4.5\(\times\).
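As a rough illustration of how the headline figures relate to one another, the short Python sketch below converts SYPD into wall-clock time per simulated year and derives the strong-scaling speedup and parallel efficiency from the 0.1\(^{\circ }\) timings. The function names are hypothetical and are not part of LICOM3-CUDA; the parallel-efficiency figure is computed here from the quoted numbers and is not a value reported in the abstract.

```python
# Back-of-the-envelope check of the figures quoted in the abstract.
# All function names are illustrative assumptions, not LICOM3-CUDA code.

SECONDS_PER_DAY = 86_400


def wall_seconds_per_sim_year(sypd: float) -> float:
    """Wall-clock seconds needed to simulate one model year at a given SYPD."""
    return SECONDS_PER_DAY / sypd


def speedup(t_base: float, t_new: float) -> float:
    """Ratio of baseline run time to new run time."""
    return t_base / t_new


def strong_scaling_efficiency(t_base: float, t_new: float,
                              n_base: int, n_new: int) -> float:
    """Achieved speedup divided by the ideal speedup (n_new / n_base)."""
    return speedup(t_base, t_new) / (n_new / n_base)


if __name__ == "__main__":
    # 1-degree case: 114.3 SYPD on four GPUs.
    print(f"seconds per simulated year: {wall_seconds_per_sim_year(114.3):.1f}")

    # 0.1-degree case: 3261 s on 96 GPUs down to 720 s on 1536 GPUs.
    print(f"speedup: {speedup(3261, 720):.2f}x")
    print(f"parallel efficiency: "
          f"{strong_scaling_efficiency(3261, 720, 96, 1536):.1%}")
```

With the quoted numbers, 114.3 SYPD corresponds to roughly 756 seconds of wall-clock time per simulated year, and the 4.5\(\times\) speedup on 16 times as many GPUs corresponds to a strong-scaling efficiency of about 28%.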
Data and code availability
The model code (LICOM3-CUDA v1.0), along with the paper data, dataset, and a 100 km (1\(^{\circ }\)) case, can be downloaded from https://zenodo.org/record/7440403 (last access: 15 December 2022) [42].
References
Lazo JK, Lawson M, Larsen PH, Waldman DM (2011) U.S. economic sensitivity to weather variability. Bull Am Meteorol Soc 92(6):709–720. https://doi.org/10.1175/2011BAMS2928.1
Schär C, Fuhrer O, Arteaga A, Ban N, Charpilloz C, Girolamo SD, Hentgen L, Hoefler T, Lapillonne X, Leutwyler D, Osterried K, Panosetti D, Rüdisühli S, Schlemmer L, Schulthess TC, Sprenger M, Ubbiali S, Wernli H (2020) Kilometer-scale climate models: prospects and challenges. Bull Am Meteorol Soc 101(5):567–587. https://doi.org/10.1175/BAMS-D-18-0167.1
Khan HN, Hounshell DA, Fuchs ER (2018) Science and research policy at the end of moore’s law (vol 1, pg 14, 2018). Nat Electron 1(2):146–146. https://doi.org/10.1038/s41928-017-0005-9
Frank DJ, Dennard RH, Nowak E, Solomon PM, Taur Y, Wong H-SP (2001) Device scaling limits of si mosfets and their application dependencies. Proc IEEE 89(3):259–288. https://doi.org/10.1109/5.915374
Bauer P, Dueben PD, Hoefler T, Quintino T, Schulthess TC, Wedi NP (2021) The digital revolution of earth-system science. Nat Comput Sci 1(2):104–113. https://doi.org/10.1038/s43588-021-00023-0
Michalakes J, Vachharajani M (2008) Gpu acceleration of numerical weather prediction. Parallel Process Lett 18(04):531–548. https://doi.org/10.1142/S0129626408003557
Wang Y, Jiang J, Zhang H, Dong X, Wang L, Ranjan R, Zomaya AY (2017) A scalable parallel algorithm for atmospheric general circulation models on a multi-core cluster. Futur Gener Comput Syst 72:1–10. https://doi.org/10.1016/j.future.2017.02.008
TOP 500 NOVEMBER 2021. https://www.top500.org/lists/top500/2021/11/ Accessed 23 January 2022
Zhao W-L, Wang W, Wang Q (2022) Optimization of cosmological n-body simulation with fmm-pm on simt accelerators. J Supercomput 78(5):7186–7205. https://doi.org/10.1007/s11227-021-04153-0
Sojoodi AH, Salimi Beni M, Khunjush F (2021) Ignite-gpu: a gpu-enabled in-memory computing architecture on clusters. J Supercomput 77(3):3165–3192
Rani S, Gupta O (2017) Clus_gpu-blastp: accelerated protein sequence alignment using gpu-enabled cluster. J Supercomput 73(10):4580–4595
Bleichrodt F, Bisseling RH, Dijkstra HA (2012) Accelerating a barotropic ocean model using a gpu. Ocean Model 41:16–21
Chen B, Zhu J, Li L (2012) Accelerating 3d ocean model development by using gpu computing. In: Deng W (ed) Futur Control Autom. Springer, Berlin, Heidelberg, pp 37–43
Yamagishi T, Matsumura Y (2016) Gpu acceleration of a non-hydrostatic ocean model with a multigrid poisson/helmholtz solver. Procedia Computer Science 80:1658–1669. https://doi.org/10.1016/j.procs.2016.05.502. International Conference on Computational Science 2016, ICCS 2016, 6-8 June 2016, San Diego, California, USA
Zhao X-d, Liang S-x, Sun Z-c, Zhao X-z, Sun J-w, Liu Z-b (2017) A gpu accelerated finite volume coastal ocean model. J Hydrodyn, Ser. B 29(4):679–690. https://doi.org/10.1016/S1001-6058(16)60780-1
Panzer I, Lines S, Mak J, Choboter P, Lupo C (2013) High performance regional ocean modeling with gpu acceleration. In: 2013 OCEANS - San Diego, 1–4. https://doi.org/10.23919/OCEANS.2013.6741366
Mak J, Choboter P, Lupo C (2011) Numerical ocean modeling and simulation with cuda. In: OCEANS’11 MTS/IEEE KONA, 1–6. https://doi.org/10.23919/OCEANS.2011.6107199
Xu S, Huang X, Oey L-Y, Xu F, Fu H, Zhang Y, Yang G (2015) Pom.gpu-v1.0: a gpu-based princeton ocean model. Geosci Model Dev 8(9):2815–2827. https://doi.org/10.5194/gmd-8-2815-2015
Jiang J, Lin P, Wang J, Liu H, Chi X, Hao H, Wang Y, Wang W, Zhang L (2019) Porting lasg/iap climate system ocean model to gpus using openacc. IEEE Access 7:154490–154501. https://doi.org/10.1109/ACCESS.2019.2932443
Wang P, Jiang J, Lin P, Ding M, Wei J, Zhang F, Zhao L, Li Y, Yu Z, Zheng W, Yu Y, Chi X, Liu H (2021) The gpu version of lasg/iap climate system ocean model version 3 (licom3) under the heterogeneous-compute interface for portability (hip) framework and its large-scale application. Geosci Model Dev 14(5):2781–2799. https://doi.org/10.5194/gmd-14-2781-2021
Xuehong Z, Xinzhong L (1989) A numerical world ocean general circulation model. Adv Atmos Sci 6(1):44–61. https://doi.org/10.1007/BF02656917
Liu H, Lin P, Yu Y, Zhang X (2012) The baseline evaluation of lasg/iap climate system ocean model (licom) version 2. Acta Meteorol Sin 26(3):318–329. https://doi.org/10.1007/s13351-012-0305-y
Madec G, Imbard M (1996) A global ocean mesh to overcome the north pole singularity. Climate Dyn 12(6):381–388. https://doi.org/10.1007/BF00211684
Murray RJ (1996) Explicit generation of orthogonal grids for ocean models. J Comput Phys 126(2):251–273. https://doi.org/10.1006/jcph.1996.0136
St. Laurent L, Simmons H, Jayne S (2002) Estimates of tidally driven enhanced mixing in the deep ocean. Geophys Res Lett 29:2106. https://doi.org/10.1029/2002GL015633
Ferreira D, Marshall J, Heimbach P (2005) The annual cycle of the global ocean circulation as determined by 4d-var data assimilation. J Phys Oceanogr 35:1891–1910. https://doi.org/10.1175/JPO2785.1
Lin P, Liu H, Xue W, Li H, Jiang J, Song M, Song Y, Wang F, Zhang M (2016) A coupled experiment with licom2 as the ocean component of cesm1. J Meteorol Res 30(1):76–92. https://doi.org/10.1007/s13351-015-5045-3
McCartney MS, Talley LD (1982) The subpolar mode water of the north Atlantic ocean. J Phys Oceanogr 12(11):1169–1188. https://doi.org/10.1175/1520-0485(1982)012<1169:TSMWOT>2.0.CO;2
Gent PR, Mcwilliams JC (1990) Isopycnal mixing in ocean circulation models. J Phys Oceanogr 20(1):150–155. https://doi.org/10.1175/1520-0485(1990)020<0150:IMIOCM>2.0.CO;2
Lin P, Yu Z, Liu H, Yu Y, Li Y, Jiang J, Xue W, Chen K, Yang Q, Zhao B et al (2020) Licom model datasets for the cmip6 ocean model intercomparison project. Adv Atmos Sci 37(3):239–249. https://doi.org/10.1007/s00376-019-9208-5
Li Y, Liu H, Ding M, Lin P, Yu Z, Yu Y, Meng Y, Li Y, Jian X, Jiang J et al (2020) Eddy-resolving simulation of cas-licom3 for phase 2 of the ocean model intercomparison project. Adv Atmos Sci 37(10):1067–1080. https://doi.org/10.1007/s00376-020-0057-z
Zhang (2020) Cas-esm 2: description and climate simulation performance of the chinese academy of sciences (cas) earth system model (esm) version 2. J Adv Model Earth Syst. https://doi.org/10.1029/2020MS002210
Wang T, Jiang J, Zhang M, Zhang H, He J, Hao H, Chi X (2020) Design and research of cas-cig for earth system models. Earth and Space Science 7(7):e2019EA000965. https://doi.org/10.1029/2019EA000965
Liu H, Lin P, Zheng W, Luan Y, Ma J, Ding M, Mo H, Wan L, Ling T (2021) A global eddy-resolving ocean forecast system in china - licom forecast system (lfs). J Oper Oceanogr. https://doi.org/10.1080/1755876X.2021.1902680
Kerbyson DJ, Jones PW (2005) A performance model of the parallel ocean program. Int J High Perform Comput Appl 19(3):261–276
NVIDIA: CUDA C Programming Guide. https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html Accessed 15 August 2021
Henderson T, Middlecoff J, Rosinski J, Govett M, Madden P (2011) Experience applying fortran gpu compilers to numerical weather prediction. In: 2011 Symposium on Application Accelerators in High-Performance Computing. 34–41 https://doi.org/10.1109/SAAHPC.2011.9
AMD: HIP Programming Guide. https://rocmdocs.amd.com/en/latest/Programming_Guides/HIP-GUIDE.html Accessed 15 August 2021
Harris M (2021) How to Optimize Data Transfers in CUDA C/C++. https://developer.nvidia.com/blog/how-optimize-data-transfers-cuda-cc/ Accessed 15 August
Rucong Y (1994) A two-step shape-preserving advection scheme. Adv Atmos Sci 11(4):479–490
NVIDIA (2021): NVIDIA Collective Communication Library (NCCL) Documentation. https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/index.html Accessed 15 August
Wei J, Jiang J, Liu H, Zhang F, Lin P, Wang P, Yu Y, Chi X, Zhao L, Ding M, Li Y, Yu Z, Zheng W, Wang Y (2022) LICOM3-CUDA: a GPU Version of LASG/IAP Climate System Ocean Model Version 3 Based on CUDA. https://doi.org/10.5281/zenodo.7440403
Large WG, Yeager SG (2009) The global climatology of an interannually varying air–sea flux data set. Climate Dyn 33(2-3):341–364. https://doi.org/10.1007/s00382-008-0441-3
Redi MH (1982) Oceanic isopycnal mixing by coordinate rotation. J Phys Oceanogr 12(10):1154–1158
Fox-Kemper B, Menemenlis D (2008) Can large eddy simulation techniques improve mesoscale rich ocean models? In: Hecht W, Hasumi H (eds) Ocean modeling in an eddying regime. Geophysical Monograph Series, vol 177, American Geophysical Union, Washington DC, pp 319–337. https://doi.org/10.1029/177GM19
Acknowledgements
The study is funded by the National Natural Science Foundation of China (41931183), the National Key Research and Development Program (2016YFB0200800), and the “Earth System Science Numerical Simulator Facility” (EarthLab).
Ethics declarations
Conflict of interest
The authors declare no conflict of interest. The founding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.
About this article
Cite this article
Wei, J., Jiang, J., Liu, H. et al. LICOM3-CUDA: a GPU version of LASG/IAP climate system ocean model version 3 based on CUDA. J Supercomput 79, 9604–9634 (2023). https://doi.org/10.1007/s11227-022-05020-2
DOI: https://doi.org/10.1007/s11227-022-05020-2