{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,1,12]],"date-time":"2023-01-12T23:53:21Z","timestamp":1673567601496},"reference-count":24,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2012,2,15]],"date-time":"2012-02-15T00:00:00Z","timestamp":1329264000000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["J Supercomput"],"published-print":{"date-parts":[[2012,11]]},"DOI":"10.1007\/s11227-012-0749-y","type":"journal-article","created":{"date-parts":[[2012,2,14]],"date-time":"2012-02-14T10:48:47Z","timestamp":1329216527000},"page":"787-803","source":"Crossref","is-referenced-by-count":7,"title":["Stencil computations on heterogeneous platforms for the Jacobi method: GPUs versus Cell BE"],"prefix":"10.1007","volume":"62","author":[{"given":"Jos\u00e9 M.","family":"Cecilia","sequence":"first","affiliation":[]},{"given":"Jos\u00e9 L.","family":"Abell\u00e1n","sequence":"additional","affiliation":[]},{"given":"Juan","family":"Fern\u00e1ndez","sequence":"additional","affiliation":[]},{"given":"Manuel E.","family":"Acacio","sequence":"additional","affiliation":[]},{"given":"Jos\u00e9 M.","family":"Garc\u00eda","sequence":"additional","affiliation":[]},{"given":"Manuel","family":"Ujald\u00f3n","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2012,2,15]]},"reference":[{"key":"749_CR1","volume-title":"International conference on computational science","author":"JL Abell\u00e1n","year":"2008","unstructured":"Abell\u00e1n JL, Fern\u00e1ndez J, Acacio ME (2008) Characterizing the basic synchronization and communication operations in dual cell-based blades. In: International conference on computational science, Krakow, Poland."},{"key":"749_CR2","doi-asserted-by":"crossref","first-page":"22","DOI":"10.1109\/HPCSIM.2009.5192847","volume-title":"Proceedings of the 2009 high performance computing & simulation conference (HPCS\u201909)","author":"R Amorim","year":"2009","unstructured":"Amorim R, Haase G, Liebmann M, Weber\u00a0dos\u00a0Santos R (2009) Comparing CUDA and OpenGL implementations for a Jacobi iteration. In: Smari WW (ed) Proceedings of the 2009 high performance computing & simulation conference (HPCS\u201909), IEEE, New Jersey. Logos Verlag, Berlin, pp 22\u201332"},{"key":"749_CR3","unstructured":"Asanovic K, Bodik R, Catanzaro BC, Gebis JJ, Husbands P, Keutzer K, Patterson DA, Plishker WL, Shalf J, Williams SW, Yelick KA (2006) The landscape of parallel computing research: a\u00a0view from Berkeley. Tech rep UCB\/EECS-2006-183, EECS Department, University of California, Berkeley"},{"key":"749_CR4","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/IPDPS.2009.5161031","volume-title":"Proceedings of the 2009 IEEE international symposium on parallel & distributed processing (IPDPS \u201909)","author":"M Christen","year":"2009","unstructured":"Christen M, Schenk O, Neufeld E, Messmer P, Burkhart H (2009) Parallel data-locality aware stencil computations on modern micro-architectures. In: Proceedings of the 2009 IEEE international symposium on parallel & distributed processing (IPDPS \u201909). IEEE Computer Society, Washington, pp\u00a01\u201310"},{"key":"749_CR5","first-page":"1","volume-title":"Proceedings of the 2008 ACM\/IEEE conference on supercomputing (SC \u201908)","author":"K Datta","year":"2008","unstructured":"Datta K, Murphy M, Volkov V, Williams S, Carter J, Oliker L, Patterson D, Shalf J, Yelick K (2008) Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures. In: Proceedings of the 2008 ACM\/IEEE conference on supercomputing (SC \u201908). IEEE Press, Piscataway, pp 1\u201312"},{"key":"749_CR6","volume-title":"Society for industrial and applied mathematics","author":"JW Demmel","year":"1997","unstructured":"Demmel JW (1997) Applied numerical linear algebra. In: Society for industrial and applied mathematics. SIAM, Philadelphia"},{"key":"749_CR7","first-page":"234","volume-title":"Proceedings of 23rd international conference (ARCS)","author":"X Fang","year":"2010","unstructured":"Fang X, Tang Y, Wang G, Tang T, Zhang Y (2010) Optimizing stencil application on multi-thread GPU architecture using stream programming model. In: Proceedings of 23rd international conference (ARCS), Hannover, Germany, pp 234\u2013245"},{"key":"749_CR8","first-page":"900","volume-title":"Euro-Par","author":"E Gaona","year":"2009","unstructured":"Gaona E, Fern\u00e1ndez J, Acacio ME (2009) Fast and efficient synchronization and communication collective primitives for dual cell-based blades. In: Euro-Par, pp 900\u2013911"},{"key":"749_CR9","unstructured":"Hill J (2007) Scientific programming on the cell using ALF. Tech rep, HPCx consortium"},{"key":"749_CR10","unstructured":"Systems IBM Technology Group (2007) Cell broadband engine programming tutorial version\u00a02.1"},{"key":"749_CR11","unstructured":"IBM Systems and Technology Group (2007) SPE runtime management library version 2.1"},{"key":"749_CR12","unstructured":"Intel: Array building blocks (2012). http:\/\/software.intel.com\/en-us\/articles\/intel-array-building-blocks\/"},{"issue":"4\/5","key":"749_CR13","doi-asserted-by":"crossref","first-page":"589","DOI":"10.1147\/rd.494.0589","volume":"49","author":"J Kahle","year":"2005","unstructured":"Kahle J, Day M, Hofstee H, Johns C, Maeurer T, Shippy D (2005) Introduction to the cell multiprocessor. IBM J Res Dev 49(4\/5):589\u2013604","journal-title":"IBM J Res Dev"},{"key":"749_CR14","volume-title":"The art of parallel programming","author":"BP Lester","year":"1993","unstructured":"Lester BP (1993) The art of parallel programming. Prentice-Hall, Upper Saddle River"},{"issue":"2","key":"749_CR15","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1109\/MM.2008.31","volume":"28","author":"E Lindholm","year":"2008","unstructured":"Lindholm E, Nickolls J, Oberman S, Montrym J (2008) Nvidia tesla: a\u00a0unified graphics and computing architecture. IEEE MICRO 28(2):39\u201355. http:\/\/doi.ieeecomputersociety.org\/10.1109\/MM.2008.31","journal-title":"IEEE MICRO"},{"key":"749_CR16","first-page":"11:1","volume-title":"Proceedings of 2011 international conference for high performance computing, networking, storage and analysis (SC\u00a0\u201911)","author":"N Maruyama","year":"2011","unstructured":"Maruyama N, Nomura T, Sato K, Matsuoka S (2011) Physis: an implicitly parallel programming model for stencil computations on large-scale GPU-accelerated supercomputers. In: Proceedings of 2011 international conference for high performance computing, networking, storage and analysis (SC\u00a0\u201911), New York, USA, pp 11:1\u201311:12"},{"issue":"5","key":"749_CR17","first-page":"816","volume":"96","author":"MD McCool","year":"2008","unstructured":"McCool MD (2008) Scalable programming models for massively multicore processors. IEEE MICRO 96(5):816\u2013831","journal-title":"IEEE MICRO"},{"key":"749_CR18","unstructured":"NVIDIA: (2008) NVIDIA CUDA programming guide 2.0"},{"issue":"5","key":"749_CR19","doi-asserted-by":"crossref","first-page":"879","DOI":"10.1109\/JPROC.2008.917757","volume":"96","author":"JD Owens","year":"2008","unstructured":"Owens JD, Houston M, Luebke D, Green S, Stone JE, Phillips JC (2008) Gpu computing. Proc IEEE 96(5):879\u2013899","journal-title":"Proc IEEE"},{"issue":"1","key":"749_CR20","doi-asserted-by":"crossref","first-page":"80","DOI":"10.1111\/j.1467-8659.2007.01012.x","volume":"26","author":"JD Owens","year":"2007","unstructured":"Owens JD, Luebke D, Govindaraju N, Harris M, Kr\u00fcger J, Lefohn AE, Purcell T (2007) A survey of general-purpose computation on graphics hardware. Comput Graph Forum 26(1):80\u2013113","journal-title":"Comput Graph Forum"},{"key":"749_CR21","volume-title":"Proceedings of 21st IEEE international parallel and distributed processing symposium (IPDPS)","author":"L Renganarayana","year":"2007","unstructured":"Renganarayana L, Harthikote-matha M, Dewri R, Rajopadhye S (2007) Towards optimal multi-level tiling for stencil computations. In Proceedings of 21st IEEE international parallel and distributed processing symposium (IPDPS), Long Beach, CA, USA"},{"issue":"3","key":"749_CR22","first-page":"66","volume":"12","author":"JE Stone","year":"2010","unstructured":"Stone JE, Gohara D, Shi G (2010) Opencl: A parallel programming standard for heterogeneous computing systems. IEEE Des Test Comput 12(3):66\u201373. http:\/\/dx.doi.org\/10.1109\/MCSE.2010.69","journal-title":"IEEE Des Test Comput"},{"key":"749_CR23","doi-asserted-by":"crossref","first-page":"214","DOI":"10.1145\/1995896.1995932","volume-title":"Proceedings of the international conference on supercomputing (ICS\u00a0\u201911)","author":"D Unat","year":"2011","unstructured":"Unat D, Cai X, Baden SB (2011) Mint: realizing CUDA performance in 3D stencil methods with annotated C. In: Proceedings of the international conference on supercomputing (ICS\u00a0\u201911). ACM, New York, pp 214\u2013224"},{"key":"749_CR24","doi-asserted-by":"crossref","first-page":"244","DOI":"10.1145\/1542275.1542312","volume-title":"Proceedings of the 23rd international conference on supercomputing (ICS\u00a0\u201909)","author":"S Venkatasubramanian","year":"2009","unstructured":"Venkatasubramanian S, Vuduc RW, None N (2009) Tuned and wildly asynchronous stencil kernels for hybrid cpu\/gpu systems. In: Proceedings of the 23rd international conference on supercomputing (ICS\u00a0\u201909). ACM, New York, pp 244\u2013255"}],"container-title":["The Journal of Supercomputing"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s11227-012-0749-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s11227-012-0749-y\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s11227-012-0749-y","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,12,29]],"date-time":"2021-12-29T04:32:04Z","timestamp":1640752324000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s11227-012-0749-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,2,15]]},"references-count":24,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2012,11]]}},"alternative-id":["749"],"URL":"https:\/\/doi.org\/10.1007\/s11227-012-0749-y","relation":{},"ISSN":["0920-8542","1573-0484"],"issn-type":[{"value":"0920-8542","type":"print"},{"value":"1573-0484","type":"electronic"}],"subject":[],"published":{"date-parts":[[2012,2,15]]}}}