Abstract
The performance of collective operations has a large impact on overall performance in many HPC applications. Implementing multiple algorithms and selecting the optimal one based on the message size and the number of processes involved in the operation is essential to achieving good performance. In this paper, we present SHCOLL, a collective routines library developed on top of the OpenSHMEM API point-to-point operations: puts, gets, atomic memory updates, and memory synchronization routines. The library is designed to serve as a plug-in to OpenSHMEM implementations and will be used by the OSSS OpenSHMEM reference implementation to support OpenSHMEM collective operations. We describe the algorithms that have been incorporated in the implementation of each OpenSHMEM API collective routine and evaluate them on a Cray XC30 system. For long messages, SHCOLL shows an improvement of up to a factor of 12 compared to the vendor’s implementation. We also discuss future development of the library, as well as how it will be incorporated into the OSSS OpenSHMEM reference implementation.
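As an illustration of the selection idea described in the abstract, the following minimal C sketch picks a broadcast algorithm from the message size and the number of processes. Everything in it is an assumption made for illustration: the enum values, the function name `choose_bcast_algorithm`, and both cutoff constants are hypothetical and are not taken from SHCOLL. In practice such thresholds would be tuned by measurement on the target system.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical algorithm identifiers; these are not SHCOLL's names. */
typedef enum {
    BCAST_LINEAR,          /* root sends to every other process directly */
    BCAST_BINOMIAL_TREE,   /* logarithmic number of steps, latency-oriented */
    BCAST_SCATTER_COLLECT  /* split the payload, bandwidth-oriented */
} bcast_algorithm;

/* Hypothetical cutoffs; real values would come from benchmarking. */
#define SMALL_PE_COUNT 4
#define LONG_MESSAGE_BYTES ((size_t)64 * 1024)

/* Choose an algorithm from the message size in bytes and the process count. */
bcast_algorithm choose_bcast_algorithm(size_t nbytes, int npes)
{
    assert(npes >= 1);

    if (npes <= SMALL_PE_COUNT)
        return BCAST_LINEAR;           /* too few processes for a tree to help */
    if (nbytes < LONG_MESSAGE_BYTES)
        return BCAST_BINOMIAL_TREE;    /* short messages: minimize step count */
    return BCAST_SCATTER_COLLECT;      /* long messages: spread the bandwidth load */
}
```

A caller would evaluate `choose_bcast_algorithm(nbytes, npes)` once per collective call and dispatch to the matching implementation. Keeping the decision in one small function makes it straightforward to retune the cutoffs for a different interconnect.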
Acknowledgments
This research was funded in part by the United States Department of Defense, and was supported by resources at Los Alamos National Laboratory. This publication has been approved for public, unlimited distribution by Los Alamos National Laboratory, with document number LA-UR-18-27273.
This research used resources of the National Energy Research Scientific Computing Center (NERSC), a U.S. Department of Energy Office of Science User Facility operated under Contract No. DE-AC02-05CH11231.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Milaković, S., Budimlić, Z., Pritchard, H., Curtis, A., Chapman, B., Sarkar, V. (2019). SHCOLL - A Standalone Implementation of OpenSHMEM-Style Collectives API. In: Pophale, S., Imam, N., Aderholdt, F., Gorentla Venkata, M. (eds) OpenSHMEM and Related Technologies. OpenSHMEM in the Era of Extreme Heterogeneity. OpenSHMEM 2018. Lecture Notes in Computer Science(), vol 11283. Springer, Cham. https://doi.org/10.1007/978-3-030-04918-8_6
DOI: https://doi.org/10.1007/978-3-030-04918-8_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-04917-1
Online ISBN: 978-3-030-04918-8
eBook Packages: Computer Science, Computer Science (R0)