{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,8,23]],"date-time":"2024-08-23T20:40:07Z","timestamp":1724445607326},"reference-count":44,"publisher":"Wiley","issue":"11","license":[{"start":{"date-parts":[[2021,2,4]],"date-time":"2021-02-04T00:00:00Z","timestamp":1612396800000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/onlinelibrary.wiley.com\/termsAndConditions#vor"}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Concurrency and Computation"],"published-print":{"date-parts":[[2021,6,10]]},"abstract":"<jats:title>Summary<\/jats:title><jats:p>In this article, we introduce DiPOSH, a multi\u2010network, distributed implementation of the OpenSHMEM standard. The core idea behind DiPOSH is to have an API\u2010to\u2010network software stack as slim as possible, in order to minimize the software overhead. Following the heritage of its non\u2010distributed parent POSH, DiPOSH's communication engine is organized around the processes' shared heaps, and remote communications move data to and from these shared heaps directly. This article presents its architecture and several communication drivers, including one that takes advantage of a helper process, called the Hub, for inter\u2010process communications. This architecture allows us to explore different options for implementing the communication drivers, from high\u2010level, portable, optimized libraries to low\u2010level routines close to the hardware. We present the perspectives opened by this additional component in terms of communication scheduling between and on the nodes. 
DiPOSH is available at <jats:ext-link>https:\/\/github.com\/coti\/DiPOSH<\/jats:ext-link>.<\/jats:p>","DOI":"10.1002\/cpe.6179","type":"journal-article","created":{"date-parts":[[2021,2,8]],"date-time":"2021-02-08T02:17:32Z","timestamp":1612750652000},"update-policy":"http:\/\/dx.doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["DiPOSH: A portable OpenSHMEM implementation for short API\u2010to\u2010network path"],"prefix":"10.1002","volume":"33","author":[{"ORCID":"http:\/\/orcid.org\/0000-0002-1224-7786","authenticated-orcid":false,"given":"Camille","family":"Coti","sequence":"first","affiliation":[{"name":"LIPN, CNRS UMR 7030, Universit\u00e9 Sorbonne Paris Nord Villetaneuse France"},{"name":"University of Oregon Eugene Oregon USA"}]},{"given":"Allen D.","family":"Malony","sequence":"additional","affiliation":[{"name":"University of Oregon Eugene Oregon USA"}]}],"member":"311","published-online":{"date-parts":[[2021,2,4]]},"reference":[{"key":"e_1_2_7_2_1","article-title":"Shared memory access (SHMEM) routines","volume":"53","author":"Feind K","year":"1995","journal-title":"Cray Res"},{"key":"e_1_2_7_3_1","doi-asserted-by":"crossref","unstructured":"ChapmanB CurtisT PophaleS et al. Introducing OpenSHMEM: SHMEM for the PGAS community. Paper presented at: Proceedings of the 4th Conference on Partitioned Global Address Space Programming Model. New York;2010:1\u20103.https:\/\/doi.org\/10.1145\/2020373.2020375.","DOI":"10.1145\/2020373.2020375"},{"key":"e_1_2_7_4_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.procs.2014.05.226"},{"key":"e_1_2_7_5_1","doi-asserted-by":"crossref","unstructured":"ShamisP VenkataMG LopezMG et al. UCX: an open source framework for HPC network APIs and beyond. Paper presented at: Proceedings of the 2015 IEEE 23rd Annual Symposium on High\u2010Performance Interconnects. 
Santa Clara CA;2015:40\u201043; IEEE.","DOI":"10.1109\/HOTI.2015.13"},{"key":"e_1_2_7_6_1","doi-asserted-by":"crossref","unstructured":"CotiC MalonyAD. On the road to DiPOSH: adventures in high\u2010performance OpenSHMEM. Paper presented at: Proceedings of the 13th International Conference on Parallel Processing and Applied Mathematics (PPAM 2019);2019; Bialystok Poland.","DOI":"10.1007\/978-3-030-43229-4_22"},{"key":"e_1_2_7_7_1","doi-asserted-by":"crossref","unstructured":"PatinyasakdikulT EberiusD BosilcaG HjelmNT. Give MPI threading a fair chance: a study of multithreaded MPI designs. Paper presented at: Proceedings of the 2019 IEEE International Conference on Cluster Computing CLUSTER 2019; September 23\u201026 2019:1\u201011; Albuquerque NM.","DOI":"10.1109\/CLUSTER.2019.8891015"},{"key":"e_1_2_7_8_1","doi-asserted-by":"crossref","unstructured":"HammondJR GhoshS ChapmanBM. Implementing OpenSHMEM using MPI\u20103 one\u2010sided communication. Paper presented at: Proceedings of the Workshop on OpenSHMEM and Related Technologies;2014:44\u201058; Springer New York NY.","DOI":"10.1007\/978-3-319-05215-1_4"},{"key":"e_1_2_7_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2018.2815568"},{"key":"e_1_2_7_10_1","doi-asserted-by":"crossref","unstructured":"El\u2010GhazawiT SmithL. UPC: unified parallel C. Paper presented at: Proceedings of the 2006 ACM\/IEEE Conference on Supercomputing. Tampa FL;2006:27\u2013es.","DOI":"10.1145\/1188455.1188483"},{"key":"e_1_2_7_11_1","unstructured":"BonacheaD JeongJ. Gasnet: a portable high\u2010performance communication layer for global address\u2010space languages.CS258 Parallel Computer Architecture Project Spring;2002."},{"key":"e_1_2_7_12_1","unstructured":"MainwaringA CullerD. Active messages: organization and applications programming interface (API V2. 0). 
University of California at Berkeley Network of Workstations Project White Paper;1995."},{"key":"e_1_2_7_13_1","doi-asserted-by":"publisher","DOI":"10.1504\/IJHPCN.2004.007569"},{"key":"e_1_2_7_14_1","doi-asserted-by":"crossref","unstructured":"JoseJ LuoM SurS PandaDK.Unifying UPC and MPI runtimes: experience with MVAPICH. Paper presented at: Proceedings of the 4th Conference on Partitioned Global Address Space Programming Model. New York NY;2010:1\u201010.","DOI":"10.1145\/2020373.2020378"},{"key":"e_1_2_7_15_1","doi-asserted-by":"crossref","unstructured":"KoopMJ JonesT PandaDK. MVAPICH\u2010Aptus: scalable high\u2010performance multi\u2010transport MPI over InfiniBand. Paper presented at: Proceeding of the 2008 IEEE International Symposium on Parallel and Distributed Processing. Toulouse France;2008:1\u201012; IEEE.","DOI":"10.1109\/IPDPS.2008.4536283"},{"key":"e_1_2_7_16_1","doi-asserted-by":"publisher","DOI":"10.1023\/B:IJPP.0000029272.69895.c1"},{"key":"e_1_2_7_17_1","doi-asserted-by":"crossref","unstructured":"DinanJ BalajiP LuskE SadayappanP ThakurR.Hybrid parallel programming with MPI and unified parallel C. Paper presented at: Proceedings of the 7th ACM International Conference on Computing Frontiers. Bertinoro Italy;2010:177\u2010186.","DOI":"10.1145\/1787275.1787323"},{"key":"e_1_2_7_18_1","doi-asserted-by":"crossref","unstructured":"Gr\u00fcnewaldD. BQCD with GPI: a case study. Paper presented at: Proceedings of the 2012 International Conference on High Performance Computing Simulation (HPCS). Madrid Spain;2012:388\u2010394.","DOI":"10.1109\/HPCSim.2012.6266942"},{"key":"e_1_2_7_19_1","first-page":"160","article-title":"Hybrid\u2010parallel sparse matrix\u2010vector multiplication and iterative linear solvers with the communication library GPI","volume":"11","author":"Stoyanov D","year":"2014","journal-title":"WSEAS Trans Inf Sci Appl"},{"key":"e_1_2_7_20_1","doi-asserted-by":"crossref","unstructured":"AkhmetovaD CebamanosL IakymchukR et al. 
Interoperability of GASPI and MPI in large scale scientific applications. Paper presented at: Proceedings of the International Conference on Parallel Processing and Applied Mathematics2017:277\u2010287; Springer New York NY.","DOI":"10.1007\/978-3-319-78054-2_26"},{"key":"e_1_2_7_21_1","doi-asserted-by":"crossref","unstructured":"BreitbartJ SchmidtobreickM HeuvelineV. Evaluation of the global address space programming interface (GASPI). Paper presented at: Proceedings of the 2014 IEEE International Parallel Distributed Processing Symposium Workshops. Phoenix AZ;2014:717\u2010726.","DOI":"10.1109\/IPDPSW.2014.83"},{"key":"e_1_2_7_22_1","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.4280"},{"key":"e_1_2_7_23_1","doi-asserted-by":"crossref","unstructured":"SalaK Bell\u00f3nJ Farr\u00e9P et al. Improving the interoperability between MPI and task\u2010based programming models. Paper presented at: Proceedings of the 25th European MPI Users' Group Meeting. Barcelona Spain;2018:1\u201011.","DOI":"10.1145\/3236367.3236382"},{"key":"e_1_2_7_24_1","doi-asserted-by":"crossref","unstructured":"LuoM SeagerK MurthyKS ArcherCJ SurS HeftyS.Early evaluation of scalable fabric interface for PGAS programming models. Paper presented at: Proceedings of the 8th International Conference on Partitioned Global Address Space Programming Models;2014:1; ACM New York NY.","DOI":"10.1145\/2676870.2676871"},{"key":"e_1_2_7_25_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpdc.2012.09.016"},{"key":"e_1_2_7_26_1","doi-asserted-by":"crossref","unstructured":"AumageO BrunetE FurmentoN NamystR. NEW MADELEINE: a fast communication scheduling engine for high performance networks. Paper presented at: Proceedings of the 21th International Parallel and Distributed Processing Symposium (IPDPS 2007). Long Beach CA; March 26\u201030 2007:1\u20108","DOI":"10.1109\/IPDPS.2007.370476"},{"key":"e_1_2_7_27_1","doi-asserted-by":"crossref","unstructured":"BarrettB SquyresJM LumsdaineA GrahamRL BosilcaG. 
Analysis of the component architecture overhead in Open MPI. Paper presented at: Proceedings of the European Parallel Virtual Machine\/Message Passing Interface Users' Group Meeting;2005:175\u2010182; Springer New York NY.","DOI":"10.1007\/11557265_25"},{"key":"e_1_2_7_28_1","doi-asserted-by":"crossref","unstructured":"IancuC HofmeyrS Blagojevi\u0107F ZhengY. Oversubscription on multicore processors. Paper presented at: Proceedings of the 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS). Atlanta GA;2010:1\u201011; IEEE.","DOI":"10.1109\/IPDPS.2010.5470434"},{"key":"e_1_2_7_29_1","doi-asserted-by":"crossref","unstructured":"HaoP ShamisP VenkataMG et al. Fault tolerance for openshmem. Paper presented at: Proceedings of the 8th International Conference on Partitioned Global Address Space Programming Models. Eugene OR;2014:1\u20103.","DOI":"10.1145\/2676870.2676894"},{"key":"e_1_2_7_30_1","doi-asserted-by":"crossref","unstructured":"ButelleF CotiC.Distributed snapshot for rollback\u2010recovery with one\u2010sided communications. Paper presented at: Proceedings of the 2018 International Conference on High Performance Computing & Simulation (HPCS). Orleans France;2018:614\u2010620; IEEE.","DOI":"10.1109\/HPCS.2018.00102"},{"key":"e_1_2_7_31_1","doi-asserted-by":"crossref","unstructured":"RoparsT MartsinkevichTV GuermoucheA SchiperA CappelloF. SPBC: leveraging the characteristics of MPI HPC applications for scalable checkpointing. Paper presented at: SC'13: Proceedings of the International Conference on High Performance Computing Networking Storage and Analysis. Denver CO;2013:1\u201012; IEEE.","DOI":"10.1145\/2503210.2503271"},{"key":"e_1_2_7_32_1","doi-asserted-by":"crossref","unstructured":"FerreiraKB WidenerP LevyS ArnoldD HoeflerT. Understanding the effects of communication and coordination on checkpointing at scale. 
Paper presented at: Proceedings of the SC'14 International Conference for High Performance Computing Networking Storage and Analysis. New Orleans LA;2014:883\u2010894.","DOI":"10.1109\/SC.2014.77"},{"key":"e_1_2_7_33_1","unstructured":"BlandW BosilcaG BouteillerA HeraultT DongarraJ. A proposal for user\u2010level failure mitigation in the MPI\u20103 standard. Department of Electrical Engineering and Computer Science University of Tennessee;2012."},{"key":"e_1_2_7_34_1","doi-asserted-by":"publisher","DOI":"10.1177\/1094342016677085"},{"key":"e_1_2_7_35_1","doi-asserted-by":"crossref","unstructured":"ShahzadF KreutzerM ZeiserT et al. Building a fault tolerant application using the GASPI communication layer. Paper presented at: Proceedings of the 2015 IEEE International Conference on Cluster Computing. Chicago IL;2015:580\u2010587; IEEE.","DOI":"10.1109\/CLUSTER.2015.106"},{"key":"e_1_2_7_36_1","doi-asserted-by":"crossref","unstructured":"LinfordJC KhuvisS ShendeS MalonyA ImamN VenkataMG. Performance analysis of OpenSHMEM applications with TAU commander. Paper presented at: Proceedings of the Workshop on OpenSHMEM and Related Technologies;2017:161\u2010179; Springer New York NY.","DOI":"10.1007\/978-3-319-73814-7_11"},{"key":"e_1_2_7_37_1","doi-asserted-by":"publisher","DOI":"10.1177\/1094342015600507"},{"key":"e_1_2_7_38_1","doi-asserted-by":"crossref","unstructured":"EberiusD PatinyasakdikulT BosilcaG. Using software\u2010based performance counters to expose low\u2010level open MPI performance information. Paper presented at: Proceedings of the 24th European MPI Users' Group Meeting. Chicago IL;2017:1\u20108.","DOI":"10.1145\/3127024.3127039"},{"key":"e_1_2_7_39_1","doi-asserted-by":"crossref","unstructured":"LinfordJC KhuvisS ShendeS MalonyA ImamN VenkataMG. Profiling production OpenSHMEM applications. 
Paper presented at: Proceedings of the Workshop on OpenSHMEM and Related Technologies;2016:219\u2010224; Springer New York NY.","DOI":"10.1007\/978-3-319-50995-2_15"},{"key":"e_1_2_7_40_1","doi-asserted-by":"crossref","unstructured":"BrunetE TrahayF DenisA NamystR. A sampling\u2010based approach for communication libraries auto\u2010tuning. Paper presented at: Proceedings of the 2011 IEEE International Conference on Cluster Computing. Austin TX;2011:299\u2010307; IEEE.","DOI":"10.1109\/CLUSTER.2011.41"},{"key":"e_1_2_7_41_1","doi-asserted-by":"crossref","unstructured":"CappelloF CaronE DaydeM et al. Grid'5000: a large scale and highly reconfigurable grid experimental Testbed. Paper presented at: Proceedings of the 6th IEEE\/ACM International Workshop on Grid Computing CDIEEE\/ACM SC'05;2005:99\u2010106; Seattle Washington.","DOI":"10.1109\/GRID.2005.1542730"},{"volume-title":"Matrix Computations","year":"1983","author":"Van Loan CF","key":"e_1_2_7_42_1"},{"issue":"1","key":"e_1_2_7_43_1","first-page":"5","article-title":"Toward exascale resilience: 2014 update","volume":"1","author":"Cappello F","year":"2014","journal-title":"Supercomput Front Innovat"},{"key":"e_1_2_7_44_1","doi-asserted-by":"crossref","unstructured":"HaoP ShamisP VenkataMG et al. Fault tolerance for OpenSHMEM. Paper presented at: Proceedings of the 8th International Conference on Partitioned Global Address Space Programming ModelsPGAS '14; vol. 
3 2014:1\u201023; New York NY.","DOI":"10.1145\/2676870.2676894"},{"key":"e_1_2_7_45_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00607-013-0331-3"}],"container-title":["Concurrency and Computation: Practice and Experience"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/cpe.6179","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/full-xml\/10.1002\/cpe.6179","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/cpe.6179","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,23]],"date-time":"2024-08-23T19:40:20Z","timestamp":1724442020000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/cpe.6179"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,2,4]]},"references-count":44,"journal-issue":{"issue":"11","published-print":{"date-parts":[[2021,6,10]]}},"alternative-id":["10.1002\/cpe.6179"],"URL":"https:\/\/doi.org\/10.1002\/cpe.6179","archive":["Portico"],"relation":{},"ISSN":["1532-0626","1532-0634"],"issn-type":[{"type":"print","value":"1532-0626"},{"type":"electronic","value":"1532-0634"}],"subject":[],"published":{"date-parts":[[2021,2,4]]},"assertion":[{"value":"2020-02-18","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-10-28","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-02-04","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}