Abstract
The OpenSHMEM 1.4 specification recently introduced support for multithreaded hybrid programming and a new communication management API. Together, these features enable users to manage communications performed by multiple threads within an OpenSHMEM process and to overlap communication and computation to hide costly latencies. In order to realize these benefits, OpenSHMEM implementations must efficiently map this broad new space of usage models to the underlying fabric. This paper presents an implementation of OpenSHMEM 1.4 for the Intel®Omni-Path Fabric 100 Series. The OpenFabrics Interfaces (OFI) libfabric is used as the low-level fabric API; we identify strategies for effectively managing shared transmission resources using libfabric, as well as for managing the communication requirements of the Omni-Path fabric. We study the performance of our implementation, identify design tradeoffs that are influenced by application behavior, and explore application-level optimizations that can be used to achieve the best performance.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
Other names and brands may be claimed as the property of others.
- 2.
Intel and Xeon are trademarks of Intel Corporation in the U.S. and/or other countries.
Benchmark results were obtained prior to implementation of recent software patches and firmware updates intended to address exploits referred to as “Spectre” and “Meltdown”. Implementation of these updates may make these results inapplicable to your device or system.
Software and workloads used in performance tests may have been optimized for performance only on Intel® microprocessors. Performance tests, such as SYSmark and MobileMark (see footnote 1), are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
For more information go to http://www.intel.com/benchmarks.
References
Barrett, B.W., Brigthwell, R., Hemmert, K.S., Pedretti, K., Wheeler, K., Underwood, K.D.: Enhanced support for OpenSHMEM communication in portals. In: IEEE 19th Annual Symposium on High Performance Interconnects. HotI, August 2011
Barrett, B.W., et al.: The Portals 4.0 network programming interface. Sandia National Laboratories, November 2012, Technical report SAND2012-10087 (2012)
Birrittella, M.S., et al.: Intel® Omni-path architecture: enabling scalable, high performance fabrics. In: 2015 IEEE 23rd Annual Symposium on High-Performance Interconnects, pp. 1–9, August 2015
Boehm, S., Pophale, S., Baker, M.B., Venkata, M.G.: Merged requests for better performance and productivity in multithreaded OpenSHMEM. In: Gorentla Venkata, M., Imam, N., Pophale, S. (eds.) OpenSHMEM 2017. LNCS, vol. 10679, pp. 35–49. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-73814-7_3
Boehm, S., Pophale, S., Venkata, M.G.: Evaluating OpenSHMEM explicit remote memory access operations and merged requests. In: Gorentla Venkata, M., Imam, N., Pophale, S., Mintz, T.M. (eds.) OpenSHMEM 2016. LNCS, vol. 10007, pp. 18–34. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-50995-2_2
Bouteiller, A., Pophale, S., Boehm, S., Baker, M.B., Venkata, M.G.: Evaluating contexts in OpenSHMEM-X reference implementation. In: Gorentla Venkata, M., Imam, N., Pophale, S. (eds.) OpenSHMEM 2017. LNCS, vol. 10679, pp. 50–62. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-73814-7_4
Boyle, P., Chuvelev, M., Cossu, G., Kelly, C., Lehner, C., Meadows, L.: Accelerating HPC codes on Intel® Omni-path architecture networks: from particle physics to machine learning. arXiv preprint arXiv:1711.04883 (2017)
Brightwell, R., Pedretti, K.: An intra-node implementation of OpenSHMEM using virtual address space mapping. In: Proceedings of the 5th International Conference on Partitioned Global Address Space Programming Models. PGAS 2011 (2011)
ten Bruggencate, M., Roweth, D., Oyanagi, S.: Thread-safe SHMEM extensions. In: Poole, S., Hernandez, O., Shamis, P. (eds.) OpenSHMEM 2014. LNCS, vol. 8356, pp. 178–185. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-05215-1_13
Choi, S.E., Pritchard, H., Shimek, J., Swaro, J., Tiffany, Z., Turrubiates, B.: An implementation of OFI libfabric in support of multithreaded PGAS solutions. In: Proceedings of 9th International Conference on Partitioned Global Address Space Programming Models. PGAS, pp. 59–69, September 2015
Dinan, J., Flajslik, M.: Contexts: a mechanism for high throughput communication in OpenSHMEM. In: Proceedings of the 8th International Conference on Partitioned Global Address Space Programming Models, pp. 10:1–10:9. ACM, New York (2014)
Grossman, M., Doyle, J., Dinan, J., Pritchard, H., Seager, K., Sarkar, V.: Implementation and evaluation of OpenSHMEM contexts using OFI libfabric. In: Gorentla Venkata, M., Imam, N., Pophale, S. (eds.) OpenSHMEM 2017. LNCS, vol. 10679, pp. 19–34. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-73814-7_2
Grossman, M., Kumar, V., Budimlić, Z., Sarkar, V.: Integrating asynchronous task parallelism with OpenSHMEM. In: Gorentla Venkata, M., Imam, N., Pophale, S., Mintz, T.M. (eds.) OpenSHMEM 2016. LNCS, vol. 10007, pp. 3–17. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-50995-2_1
Grun, P., et al.: A brief introduction to the OpenFabrics interfaces - a new network API for maximizing high performance application efficiency. In: 2015 IEEE 23rd Annual Symposium on High-Performance Interconnects, pp. 34–39, August 2015
Knaak, D., Namashivayam, N.: Proposing OpenSHMEM extensions towards a future for hybrid programming and heterogeneous computing. In: Gorentla Venkata, M., Shamis, P., Imam, N., Lopez, M.G. (eds.) OpenSHMEM 2014. LNCS, vol. 9397, pp. 53–68. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-26428-8_4
MPI Forum: MPI: a message-passing interface standard version 3.1. Technical report, University of Tennessee, Knoxville, June 2015
Namashivayam, N., Knaak, D., Cernohous, B., Radcliffe, N., Pagel, M.: An evaluation of thread-safe and contexts-domains features in cray SHMEM. In: Gorentla Venkata, M., Imam, N., Pophale, S., Mintz, T.M. (eds.) OpenSHMEM 2016. LNCS, vol. 10007, pp. 163–180. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-50995-2_11
Intel® Omni-Path Fabric Host Software User Guide Rev 9.0, April 2018. https://www.intel.com/content/dam/support/us/en/documents/network-and-i-o/fabric-products/Intel_OP_Fabric_Host_Software_UG_H76470_v9_0.pdf
OpenSHMEM Application Programming Interface, Version 1.4, December 2017. http://openshmem.org/site/sites/default/site_files/OpenSHMEM-1.4.pdf
Intel® Performance Scaled Messaging 2 (PSM2) Programmer’s Guide, April 2018. https://www.intel.com/content/dam/support/us/en/documents/network-and-i-o/fabric-products/Intel_PSM2_PG_H76473_v9_0.pdf
Poole, S., et al.: OpenSHMEM extensions and a vision for its future direction. In: Poole, S., Hernandez, O., Shamis, P. (eds.) OpenSHMEM 2014. LNCS, vol. 8356, pp. 149–162. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-05215-1_11
Sandia OpenSHMEM (2018). https://github.com/Sandia-OpenSHMEM/SOS
Seager, K., Choi, S.-E., Dinan, J., Pritchard, H., Sur, S.: Design and implementation of OpenSHMEM using OFI on the aries interconnect. In: Gorentla Venkata, M., Imam, N., Pophale, S., Mintz, T.M. (eds.) OpenSHMEM 2016. LNCS, vol. 10007, pp. 97–113. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-50995-2_7
The high-resolution timer API, 16 January 2006. https://lwn.net/Articles/167897/
Weeks, H., Dosanjh, M.G.F., Bridges, P.G., Grant, R.E.: SHMEM-MT: a benchmark suite for assessing multi-threaded SHMEM performance. In: Gorentla Venkata, M., Imam, N., Pophale, S., Mintz, T.M. (eds.) OpenSHMEM 2016. LNCS, vol. 10007, pp. 227–231. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-50995-2_16
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Ozog, D., Rahman, M.Wu., Seager, K., Dinan, J. (2019). Design and Optimization of OpenSHMEM 1.4 for the Intel® Omni-Path Fabric 100 Series. In: Pophale, S., Imam, N., Aderholdt, F., Gorentla Venkata, M. (eds) OpenSHMEM and Related Technologies. OpenSHMEM in the Era of Extreme Heterogeneity. OpenSHMEM 2018. Lecture Notes in Computer Science(), vol 11283. Springer, Cham. https://doi.org/10.1007/978-3-030-04918-8_2
Download citation
DOI: https://doi.org/10.1007/978-3-030-04918-8_2
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-04917-1
Online ISBN: 978-3-030-04918-8
eBook Packages: Computer ScienceComputer Science (R0)