{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,9,17]],"date-time":"2023-09-17T06:41:04Z","timestamp":1694932864477},"reference-count":63,"publisher":"Wiley","issue":"13","license":[{"start":{"date-parts":[[2016,1,15]],"date-time":"2016-01-15T00:00:00Z","timestamp":1452816000000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100000266","name":"Engineering and Physical Sciences Research Council","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100000266","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Concurrency and Computation"],"published-print":{"date-parts":[[2016,9,10]]},"abstract":"Summary<\/jats:title>Symbolic computation has underpinned a number of key advances in Mathematics and Computer Science. Applications are typically large and potentially highly parallel, making them good candidates for parallel execution at a variety of scales from multi\u2010core to high\u2010performance computing systems. However, much existing work on parallel computing is based around numeric rather than symbolic computations. In particular, symbolic computing presents particular problems in terms of varying granularity and irregular task sizes that do not match conventional approaches to parallelisation. It also presents problems in terms of the structure of the algorithms and data. This paper describes a new implementation of the free open\u2010source GAP computational algebra system that places parallelism at the heart of the design, dealing with the key scalability and cross\u2010platform portability problems. We provide three system layers that deal with the three most important classes of hardware: individual shared memory multi\u2010core nodes, mid\u2010scale distributed clusters of (multi\u2010core) nodes and full\u2010blown high\u2010performance computing systems, comprising large\u2010scale tightly connected networks of multi\u2010core nodes. This requires us to develop new cross\u2010layer programming abstractions in the form of new domain\u2010specific skeletons that allow us to seamlessly target different hardware levels. Our results show that, using our approach, we can achieve good scalability and speedups for two realistic exemplars, on high\u2010performance systems comprising up to 32000 cores, as well as on ubiquitous multi\u2010core systems and distributed clusters. The work reported here paves the way towards full\u2010scale exploitation of symbolic computation by high\u2010performance computing systems, and we demonstrate the potential with two major case studies. \u00a9 2016 The Authors. Concurrency and Computation: Practice and Experience<\/jats:italic> Published by John Wiley & Sons Ltd.<\/jats:p>","DOI":"10.1002\/cpe.3746","type":"journal-article","created":{"date-parts":[[2016,1,15]],"date-time":"2016-01-15T14:17:00Z","timestamp":1452867420000},"page":"3606-3636","source":"Crossref","is-referenced-by-count":3,"title":["HPC\u2010GAP: engineering a 21st\u2010century high\u2010performance computer algebra system"],"prefix":"10.1002","volume":"28","author":[{"given":"Reimer","family":"Behrends","sequence":"first","affiliation":[{"name":"Department of Mathematics University of Kaiserslautern Kaiserslautern Germany"}]},{"given":"Kevin","family":"Hammond","sequence":"additional","affiliation":[{"name":"School of Computer Science University of St Andrews Fife UK"}]},{"given":"Vladimir","family":"Janjic","sequence":"additional","affiliation":[{"name":"School of Computer Science University of St Andrews Fife UK"}]},{"given":"Alexander","family":"Konovalov","sequence":"additional","affiliation":[{"name":"School of Computer Science University of St Andrews Fife UK"}]},{"given":"Steve","family":"Linton","sequence":"additional","affiliation":[{"name":"School of Computer Science University of St Andrews Fife UK"}]},{"given":"Hans\u2010Wolfgang","family":"Loidl","sequence":"additional","affiliation":[{"name":"School of Mathematical and Computer Sciences Heriot\u2010Watt University Edinburgh UK"}]},{"given":"Patrick","family":"Maier","sequence":"additional","affiliation":[{"name":"School of Computing Science University of Glasgow Glasgow UK"}]},{"given":"Phil","family":"Trinder","sequence":"additional","affiliation":[{"name":"School of Computing Science University of Glasgow Glasgow UK"}]}],"member":"311","published-online":{"date-parts":[[2016,1,15]]},"reference":[{"key":"e_1_2_9_2_1","unstructured":"GAP Group.GAP \u2013 Groups algorithms and programming 2007. (Avialable from:http:\/\/www.gap-system.org) [Accessed on October 2015]."},{"key":"e_1_2_9_3_1","unstructured":"BehrendsR MaierP JanjicV.HPC\u2010GAP open access repository 2015. 10.5281\/zenodo.34012."},{"key":"e_1_2_9_4_1","doi-asserted-by":"crossref","unstructured":"BehrendsR KonovalovA LintonS L\u00fcbeckF Neunh\u00f6fferM.Parallelising the computational algebra system GAP. InProceedings of the 4th International Workshop on Parallel Symbolic Computation PASCO 2010 july 21\u201023 2010:Grenoble France 2010;177\u2013178. DOI:10.1145\/1837210.1837239.","DOI":"10.1145\/1837210.1837239"},{"key":"e_1_2_9_5_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.cl.2014.03.001"},{"key":"e_1_2_9_6_1","unstructured":"Janjic V BrownCM NeunhoefferM HammondK LintonSA LoidlHW.Space exploration using parallel orbits. InParCo\u00a02013: Parallel Computing Advances in Parallel Computing vol.\u00a025.IOS Press:Munich Germany 2014;225\u2013232."},{"key":"e_1_2_9_7_1","doi-asserted-by":"crossref","unstructured":"MaierP LiveseyD LoidlHW TrinderP.High\u2010performance computer algebra: a Hecke algebra case study. InEuro\u2010Par\u00a02014.Springer:Porto Portugal 2014;415\u2013426.","DOI":"10.1007\/978-3-319-09873-9_35"},{"key":"e_1_2_9_8_1","doi-asserted-by":"publisher","DOI":"10.1142\/S0218196702001115"},{"key":"e_1_2_9_9_1","unstructured":"Maplesoft.Maple 2015. (Available from:http:\/\/www.maplesoft.com\/products\/Maple\/academic\/) [Accessed on October 2015]."},{"key":"e_1_2_9_10_1","unstructured":"Wolfram Research.Mathematica 2015. (Available from:http:\/\/www.wolfram.com\/mathematica\/) [Accessed on October 2015]."},{"key":"e_1_2_9_11_1","unstructured":"Mathworks.Matlab 2015. (Available from:http:\/\/uk.mathworks.com\/products\/matlab\/) [Accessed on October 2015]."},{"key":"e_1_2_9_12_1","unstructured":"Computational algeb\u00a0groupSchoolofMathematics StatisticsUniversityofSydney.MAGMA computational algebra system. (Available from:http:\/\/magma.maths.usyd.edu.au\/magma\/) [Accessed on October 2015]."},{"key":"e_1_2_9_13_1","unstructured":"SystemSageMathematicalSoftware.Sage 2015. (Available from:http:\/\/www.sagemath.org\/) [Accessed on October 2015]."},{"key":"e_1_2_9_14_1","doi-asserted-by":"crossref","unstructured":"KaltofenEL.Fifteen years after DSC and WLSS2: what parallel computations I do today. InPASCO 2010: International Workshop on Parallel Symbolic Computation.ACM Press:Grenoble France 2010;10\u201317.","DOI":"10.1145\/1837210.1837213"},{"key":"e_1_2_9_15_1","volume-title":"Patterns for Parallel Programming","author":"Mattson TG","year":"2004"},{"key":"e_1_2_9_16_1","volume-title":"Algorithmic Skeletons: Structured Management of Parallel Computation","author":"Cole MI","year":"1989"},{"key":"e_1_2_9_17_1","volume-title":"Structured Parallel Programming","author":"McCool M","year":"2012"},{"key":"e_1_2_9_18_1","volume-title":"Parallel Programming with Microsoft. NET \u2014 Design Patterns for Decomposition and Coordination on Multicore Architectures","author":"Campbell C","year":"2010"},{"key":"e_1_2_9_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/1327452.1327492"},{"key":"e_1_2_9_20_1","unstructured":"Apache Hadoop NextGen MapReduce (YARN) 2013. (Available from:http:\/\/hadoop.apache.org\/docs\/current2\/hadoop-yarn\/hadoop-yarn-site\/YARN.html) [Accessed on 10 February 2014]."},{"key":"e_1_2_9_21_1","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.4330070305"},{"key":"e_1_2_9_22_1","doi-asserted-by":"crossref","unstructured":"BotorogGH KuchenH.Efficient parallel programming with algorithmic skeletons. InEuro\u2010Par '96: Proceedings of the Second International Euro\u2010par Conference on Parallel Processing Lyon France LNCS\u00a01123. Springer:Lyon France 1996;718\u2013731. 10.1007\/3\u2010540\u201061626\u20108 95.","DOI":"10.1007\/3-540-61626-8_95"},{"key":"e_1_2_9_23_1","doi-asserted-by":"crossref","unstructured":"DarlingtonJ FieldAJ HarrisonPG KellyPaulHJ SharpDWN WuQ.Parallel programming using skeleton functions. InPARLE '93: Proceedings of the 5th International Parle Conference on Parallel Architectures and Languages Europe.Springer:London UK 1993;146\u2013160.","DOI":"10.1007\/3-540-56891-3_12"},{"key":"e_1_2_9_24_1","doi-asserted-by":"crossref","unstructured":"EnmyrenJ KesslerCW.SkePU: A multi\u2010backend skeleton programming library for multi\u2010GPU systems. InProceedings of the Fourth International Workshop on High\u2010Level Parallel Programming and Applications HLPP\u00a0'10.ACM:New York NY USA 2010;5\u201314.","DOI":"10.1145\/1863482.1863487"},{"key":"e_1_2_9_25_1","unstructured":"CiechanowiczP PoldnerM KuchenH.The M\u00fcnster skeleton library Muesli\u2014a comprehensive overview.ERCIS Working Paper No. 7 ISSN 1614\u20107448 European Research Center for Information Systems M\u00fcnster Germany 2009\u00a0 (Available from:http:\/\/www.ercis.org\/sites\/default\/files\/publications\/2012\/ercis_{w}p_{n}o_7.pdf) [Accessed on 18 December 2015]."},{"key":"e_1_2_9_26_1","doi-asserted-by":"crossref","unstructured":"BenoitA ColeM GilmoreS HillstonJ.Flexible skeletal programming with Eskel. InProceedings of the 11th International Euro\u2010Par Conference on Parallel Processing Euro\u2010Par'05.Springer:Berlin Heidelberg 2005;761\u2013770.","DOI":"10.1007\/11549468_83"},{"key":"e_1_2_9_27_1","doi-asserted-by":"crossref","unstructured":"DumasJG GautierT PernetC SaundersB..LinBox founding scope allocation parallel building blocks and separate compilation. InICMS'2010: Mathematical Software LNCS\u00a06327.Springer:Kobe Japan 2010;77\u201383. 10.1007\/978\u20103\u2010642\u201015582\u20106 16.","DOI":"10.1007\/978-3-642-15582-6_16"},{"key":"e_1_2_9_28_1","doi-asserted-by":"publisher","DOI":"10.1177\/1094342007078442"},{"key":"e_1_2_9_29_1","unstructured":"CharlesP GrothoffC SaraswatVA DonawaC KielstraA EbciogluK vonPraunC SarkarV.X10: An object\u2010oriented approach to non\u2010uniform cluster computing. InOOPSLA\u00a02005: Conference on Object\u2010Oriented Programming Systems Languages and Applications.ACM Press:San Diego USA 2005;519\u2013538."},{"key":"e_1_2_9_30_1","unstructured":"AllenE ChaseD HallettJ LuchangcoV MaessenJW RyuS SteeleGL Jr. Tobin\u2010HochstadtS.The Fortress language specification version 1.0. Technical Report Sun Microsystems Inc. Santa Clara CA United States 2008."},{"key":"e_1_2_9_31_1","unstructured":"UPC Consortium.Unified parallel C language spec. v1.3. Teachnical Report Lawrence Berkeley National Lab Berkeley CA USA 2013."},{"key":"e_1_2_9_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/289918.289920"},{"key":"e_1_2_9_33_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-55826-9"},{"key":"e_1_2_9_34_1","doi-asserted-by":"crossref","unstructured":"RochJL VillardG.Parallel computer algebra. Technical Report IMAG France 1997\u00a0 Tutorial at ISSAC\u00a01997: International Symposium on Symbolic and Algebraic Computation.","DOI":"10.1145\/258726.276957"},{"key":"e_1_2_9_35_1","doi-asserted-by":"crossref","unstructured":"MazaMM StephensonB WattSM XieY.Multiprocessed parallelism support in ALDOR on SMPs and multicores. InInternational Workshop on Parallel Symbolic Computation PASCO\u00a0'07.ACM Press:London Ontario Canada 2007;60\u201368.","DOI":"10.1145\/1278177.1278188"},{"key":"e_1_2_9_36_1","doi-asserted-by":"crossref","unstructured":"K\u00fcchlinW.PARSAC\u20102: A parallel SAC\u20102 based on threads. InAAECC\u20108: Applied Algebra Algebraic Algorithms and Error\u2010Correcting Codes LNCS\u00a0508.Springer Tokyo Japan 1991;341\u2013353. 10.1007\/3\u2010540\u201054195\u20100_63.","DOI":"10.1007\/3-540-54195-0_63"},{"issue":"1","key":"e_1_2_9_37_1","first-page":"111","article-title":"The design of the saclib\/paclib kernels","volume":"19","author":"Schreiner W","year":"1995","journal-title":"Journal of Symbolic Computation"},{"key":"e_1_2_9_38_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0747-7171(02)00137-2"},{"key":"e_1_2_9_39_1","doi-asserted-by":"crossref","unstructured":"CharBW.Progress report on a system for general\u2010purpose parallel symbolic algebraic computation. InISSAC\u00a01990: International Symposium on Symbolic and Algebraic Computation.ACM Press:Tokyo Japan 1990;96\u2013103. 10.1145\/96877.96902.","DOI":"10.1145\/96877.96902"},{"key":"e_1_2_9_40_1","doi-asserted-by":"crossref","unstructured":"CoopermanG..GAP\/MPI: facilitating parallelism. InProceedings of DIMACS Workshop on Groups and Computation II 28 DIMACS Series in Discrete Mathematics and Theoretical Computer Science.AMS:Piscataway NJ USA 1997.","DOI":"10.1090\/dimacs\/028\/06"},{"key":"e_1_2_9_41_1","doi-asserted-by":"crossref","unstructured":"CoopermanG.TOP\u2010C: task\u2010oriented parallel C for distributed and shared memory. InWorkshop on Wide Area Networks and High Performance Computing LNCS\u00a0249.Springer:Essen Germany 1999;109\u2013117. 10.1007\/BFb0110081.","DOI":"10.1007\/BFb0110081"},{"key":"e_1_2_9_42_1","doi-asserted-by":"crossref","unstructured":"CoopermanG TselmanM.New sequential and parallel algorithms for generating high dimension Hecke algebras using the condensation technique. InISSAC\u00a01996: International Symposium on Symbolic and Algebraic Computation.ACM Press:Zurich Switzerland 1996;155\u2013160. DOI:10.1145\/236869.236927.","DOI":"10.1145\/236869.236927"},{"key":"e_1_2_9_43_1","doi-asserted-by":"crossref","unstructured":"KunkleD. SlaviciV. CoopermanG..Parallel disk\u2010based computation for large monolithic binary decision diagrams. InPASCO\u00a02010: International Workshop on Parallel Symbolic Computation.ACM Press:Grenoble France 2010;63\u201372.10.1145\/1837210.1837222.","DOI":"10.1145\/1837210.1837222"},{"key":"e_1_2_9_44_1","doi-asserted-by":"crossref","unstructured":"SibertEE MattsonHF JacksonP.Finite field arithmetic using the connection machine. InCAP\u00a01990: International Workshop on Computer Algebra and Parallellism LNCS\u00a0584.Springer:Ithaca USA 1992;51\u201361.10.1007\/3\u2010540\u201055328\u20102\u20104.","DOI":"10.1007\/3-540-55328-2_4"},{"key":"e_1_2_9_45_1","doi-asserted-by":"crossref","unstructured":"RochJL S\u00e9n\u00e9chaudP Fran\u00e7oise Siebert\u2010RochF VillardG.Computer algebra on MIMD machine. InISSAC\u00a01998: International Symposium on Symbolic and Algebraic Computation LNCS\u00a0358.Springer:Rome Italy 1989;423\u2013439. DOI:10.1007\/3\u2010540\u201051084\u20102_40.","DOI":"10.1007\/3-540-51084-2_40"},{"key":"e_1_2_9_46_1","doi-asserted-by":"crossref","unstructured":"NeunW MelenkH.Very large Gr\u00f6bner basis calculations. InInternational Workshop on Computer Algebra and Parallellism CAP\u00a01990 LNCS\u00a0584.Springer:Ithaca USA 1992;89\u201399. 10.1007\/3\u2010540\u201055328\u20102_7.","DOI":"10.1007\/3-540-55328-2_7"},{"key":"e_1_2_9_47_1","unstructured":"LoustaunauP WangPY.Towards efficient parallelizations of a computer algebra algorithm. InFrontiers of Massively Parallel Computation.IEEE McLean VA USA 1992;67\u201374. 10.1109\/FMPC.1992.234903."},{"key":"e_1_2_9_48_1","unstructured":"Maplesoft.Maple grid computing toolbox. (Available from:http:\/\/www.maplesoft.com\/products\/toolboxes\/GridComputing) [Accessed on October 2015]."},{"key":"e_1_2_9_49_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jsc.2011.12.019"},{"key":"e_1_2_9_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/4472.4478"},{"key":"e_1_2_9_51_1","doi-asserted-by":"crossref","unstructured":"KonovalovA LintonS.Parallel computations in modular group algebras. InPASCO\u00a02010: International Workshop on Parallel Symbolic Computation.ACM Press:Grenoble France 2010;141\u2013149.","DOI":"10.1145\/1837210.1837231"},{"key":"e_1_2_9_52_1","unstructured":"GastineauM..TRIP home page. Web page. (Available from:http:\/\/www.imcce.fr\/Equipes\/ASD\/trip\/trip.php) [Accessed on October 2015]."},{"key":"e_1_2_9_53_1","doi-asserted-by":"crossref","unstructured":"GastineauM LaskarJ.Development of TRIP: fast sparse multivariate polynomial multiplication using burst tries. InICCS 2006: Computational Science LNCS 3992.Springer:Reading UK 2006;446\u2013453. 10.1007\/11758525_60.","DOI":"10.1007\/11758525_60"},{"key":"e_1_2_9_54_1","unstructured":"MetznerT RadimerskyM SorgatzA WehmeierS.User's guide to macro parallelism in MuPAD 1.4.1 1999."},{"key":"e_1_2_9_55_1","unstructured":"GridMathematica. (Available from:http:\/\/www.wolfram.com\/gridmathematica\/) [Accessed on October 2015]."},{"key":"e_1_2_9_56_1","doi-asserted-by":"crossref","unstructured":"CoopermanG.Parallel GAP: mature interactive parallel computing. InConference on Groups and Computation III.De\u00a0Gruyter Columbus OH USA 2001;123\u2013138.","DOI":"10.1515\/9783110872743.123"},{"key":"e_1_2_9_57_1","doi-asserted-by":"crossref","unstructured":"Al\u00a0ZainA HammondK TrinderPW LintonS LoidlHW ConstantiM.SymGrid\u2010Par: designing a framework for executing computational algebra systems on computational grids. InInternational Conference on Computational Science ICCS\u00a02007 LNCS\u00a04488.Springer:Beijing China 2007;617\u2013624. 10.1007\/978\u20103\u2010540\u201072586\u20102_90.","DOI":"10.1007\/978-3-540-72586-2_90"},{"key":"e_1_2_9_58_1","doi-asserted-by":"crossref","unstructured":"MaierP TrinderPW.Implementing a high\u2010level distributed\u2010memory parallel Haskell in Haskell. InIFL\u00a02011: International Symposium on Implementation and Application of Functional Languages LNCS\u00a07257.Springer:Lawrence Kansas USA 2012;35\u201350. 10.1007\/978\u20103\u2010642\u201034407\u20107_3.","DOI":"10.1007\/978-3-642-34407-7_3"},{"key":"e_1_2_9_59_1","doi-asserted-by":"crossref","unstructured":"MaierP StewartR TrinderPW.The HdpH DSLs for scalable reliable computation. InHaskell\u00a02014.ACM Press:Gothenburg Sweden 2014;65\u201376. 10.1145\/2633357.2633363.","DOI":"10.1145\/2633357.2633363"},{"key":"e_1_2_9_60_1","doi-asserted-by":"crossref","unstructured":"JanjicV HammondK.Granularity\u2010aware work\u2010stealing for computationally\u2010uniform grids. InCCGRID 2010.IEEE:Melbourne Australia 2010;123\u2013134. 10.1109\/CCGRID.2010.49.","DOI":"10.1109\/CCGRID.2010.49"},{"key":"e_1_2_9_61_1","unstructured":"StewartR.Reliable massively parallel symbolic computing: fault tolerance for a distributed Haskell.Ph.D. Thesis School of Mathematical and Computer Sciences Heriot\u2010Watt University 2013."},{"key":"e_1_2_9_62_1","unstructured":"Edinburgh Parallel Computing Center (EPCC).HECToR National UK Super Computing Resource Edinburgh 2013. (Available from:http:\/\/www.hector.ac.uk\/) [Accessed on 18 December 2015]."},{"key":"e_1_2_9_63_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00222-007-0053-2"},{"key":"e_1_2_9_64_1","doi-asserted-by":"publisher","DOI":"10.1017\/S095679681600006X"}],"container-title":["Concurrency and Computation: Practice and Experience"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/api.wiley.com\/onlinelibrary\/tdm\/v1\/articles\/10.1002%2Fcpe.3746","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/cpe.3746","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/full-xml\/10.1002\/cpe.3746","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/cpe.3746","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,17]],"date-time":"2023-09-17T00:52:00Z","timestamp":1694911920000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/cpe.3746"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,1,15]]},"references-count":63,"journal-issue":{"issue":"13","published-print":{"date-parts":[[2016,9,10]]}},"alternative-id":["10.1002\/cpe.3746"],"URL":"https:\/\/doi.org\/10.1002\/cpe.3746","archive":["Portico"],"relation":{},"ISSN":["1532-0626","1532-0634"],"issn-type":[{"value":"1532-0626","type":"print"},{"value":"1532-0634","type":"electronic"}],"subject":[],"published":{"date-parts":[[2016,1,15]]}}}