{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,9,7]],"date-time":"2024-09-07T01:09:02Z","timestamp":1725671342995},"publisher-location":"New York, NY, USA","reference-count":17,"publisher":"ACM","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2014,2,7]]},"DOI":"10.1145\/2578948.2560692","type":"proceedings-article","created":{"date-parts":[[2014,2,7]],"date-time":"2014-02-07T14:23:08Z","timestamp":1391782988000},"page":"10-20","update-policy":"http:\/\/dx.doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["Reduction Operations in Parallel Loops for GPGPUs"],"prefix":"10.1145","author":[{"given":"Rengan","family":"Xu","sequence":"first","affiliation":[{"name":"Dept. of Computer Science, University of Houston, Houston, TX, 77004, USA"}]},{"given":"Xiaonan","family":"Tian","sequence":"additional","affiliation":[{"name":"Dept. of Computer Science, University of Houston, Houston, TX, 77004, USA"}]},{"given":"Yonghong","family":"Yan","sequence":"additional","affiliation":[{"name":"Dept. of Computer Science, University of Houston, Houston, TX, 77004, USA"}]},{"given":"Sunita","family":"Chandrasekaran","sequence":"additional","affiliation":[{"name":"Dept. of Computer Science, University of Houston, Houston, TX, 77004, USA"}]},{"given":"Barbara","family":"Chapman","sequence":"additional","affiliation":[{"name":"Dept. of Computer Science, University of Houston, Houston, TX, 77004, USA"}]}],"member":"320","published-online":{"date-parts":[[2014,2,7]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"October","author":"CUDA.","year":"2013","unstructured":"CUDA. 
http:\/\/www.nvidia.com\/object\/cuda_home_new.html, October 2013."},{"key":"e_1_3_2_1_2_1","volume-title":"June","author":"ACC.","year":"2013","unstructured":"OpenACC. http:\/\/www.openacc-standard.org, June 2013."},{"key":"e_1_3_2_1_3_1","volume-title":"November","author":"Reduction CL","year":"2013","unstructured":"OpenCL Reduction. http:\/\/developer.amd.com\/resources\/documentation-articles\/articles-whitepapers\/opencl-optimization-case-study-simple-reductions\/, November 2013."},{"key":"e_1_3_2_1_4_1","volume-title":"October","author":"Standard CL","year":"2013","unstructured":"OpenCL Standard. http:\/\/www.khronos.org\/opencl, October 2013."},{"key":"e_1_3_2_1_5_1","volume-title":"October","author":"MP.","year":"2013","unstructured":"OpenMP. http:\/\/www.openmp.org, October 2013."},{"key":"e_1_3_2_1_6_1","volume-title":"November","author":"Implementation GNU","year":"2013","unstructured":"The GNU OpenMP Implementation. http:\/\/gcc.gnu.org\/onlinedocs\/libgomp.pdf, November 2013."},{"key":"e_1_3_2_1_7_1","volume-title":"Programming with POSIX (R) Threads","author":"Butenhof D.","year":"1997","unstructured":"D. Butenhof. Programming with POSIX (R) Threads. Addison-Wesley Professional, 1997."},{"key":"e_1_3_2_1_8_1","volume-title":"Newnes","author":"Cook S.","year":"2012","unstructured":"S. Cook. 
CUDA Programming: A Developer's Guide to Parallel Computing with GPUs. Newnes, 2012."},{"key":"e_1_3_2_1_9_1","volume-title":"HMPP: A Hybrid Multi-core Parallel Programming Environment. In Workshop on General Purpose Processing on Graphics Processing Units (GPGPU 2007)","author":"Dolbeau R.","year":"2007","unstructured":"R. Dolbeau, S. Bihan, and F. Bodin. HMPP: A Hybrid Multi-core Parallel Programming Environment. In Workshop on General Purpose Processing on Graphics Processing Units (GPGPU 2007), 2007."},{"key":"e_1_3_2_1_10_1","first-page":"6","year":"2007","unstructured":"M. Harris. Optimizing Parallel Reduction in CUDA. NVIDIA Developer Technology, 6, 2007.","journal-title":"M. Harris. Optimizing Parallel Reduction in CUDA. NVIDIA Developer Technology"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICPP.2013.35"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.v19:18"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-02303-3_4"},{"key":"e_1_3_2_1_14_1","volume-title":"Cambridge cuda course 25-27 may","author":"Pullan G.","year":"2009","unstructured":"G. Pullan. Cambridge cuda course 25-27 may 2009. http:\/\/www.many-core.group.cam.ac.uk\/archive\/CUDAcourse09\/."},{"key":"e_1_3_2_1_15_1","volume-title":"Compiling A High-Level Directive-based Programming Model for Accelerators. In LCPC 2013: The 26th International Workshop on Languages and Compilers for Parallel Computing","author":"Tian X.","year":"2013","unstructured":"X. Tian, R. Xu, Y. Yan, Z. Yun, S. Chandrasekaran, and B. 
Chapman. Compiling A High-Level Directive-based Programming Model for Accelerators. In LCPC 2013: The 26th International Workshop on Languages and Compilers for Parallel Computing, 2013."},{"key":"e_1_3_2_1_16_1","volume-title":"Precision & Performance: Floating Point and IEEE 754 Compliance for NVIDIA GPUs. nVidia technical white paper","author":"Whitehead N.","year":"2011","unstructured":"N. Whitehead and A. Fit-Florea. Precision & Performance: Floating Point and IEEE 754 Compliance for NVIDIA GPUs. nVidia technical white paper, 2011."},{"key":"e_1_3_2_1_17_1","volume-title":"Storage and Analysis (SCC), 2012","author":"Xu R.","year":"2012","unstructured":"R. Xu, S. Chandrasekaran, B. Chapman, and C. F. Eick. Directive-based Programming Models for Scientific Applications-A Comparison. In High Performance Computing, Networking, Storage and Analysis (SCC), 2012 SC Companion, pages 1--9. 
IEEE, 2012."}],"event":{"name":"PPoPP '14: ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming","sponsor":["SIGPLAN ACM Special Interest Group on Programming Languages"],"location":"Orlando FL USA","acronym":"PPoPP '14"},"container-title":["Proceedings of Programming Models and Applications on Multicores and Manycores"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2578948.2560692","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,11]],"date-time":"2023-01-11T15:06:05Z","timestamp":1673449565000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2578948.2560692"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,2,7]]},"references-count":17,"alternative-id":["10.1145\/2578948.2560692","10.1145\/2578948"],"URL":"https:\/\/doi.org\/10.1145\/2578948.2560692","relation":{},"subject":[],"published":{"date-parts":[[2014,2,7]]},"assertion":[{"value":"2014-02-07","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}