{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,9,8]],"date-time":"2024-09-08T15:28:58Z","timestamp":1725809338007},"publisher-location":"New York, NY, USA","reference-count":34,"publisher":"ACM","license":[{"start":{"date-parts":[[2016,9,11]],"date-time":"2016-09-11T00:00:00Z","timestamp":1473552000000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"EU FET-HPC ExaNoDe","award":["H2020-671578"]},{"name":"Royal Academy of Engineering University Research Fellowship"},{"name":"UK EPSRC","award":["EP\/M004880\/1"]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2016,9,11]]},"DOI":"10.1145\/2967938.2967946","type":"proceedings-article","created":{"date-parts":[[2016,8,31]],"date-time":"2016-08-31T12:32:08Z","timestamp":1472646728000},"page":"125-137","update-policy":"http:\/\/dx.doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":28,"title":["Scalable Task Parallelism for NUMA"],"prefix":"10.1145","author":[{"given":"Andi","family":"Drebes","sequence":"first","affiliation":[{"name":"The University of Manchester, Manchester, United Kingdom"}]},{"given":"Antoniu","family":"Pop","sequence":"additional","affiliation":[{"name":"The University of Manchester, Manchester, United Kingdom"}]},{"given":"Karine","family":"Heydemann","sequence":"additional","affiliation":[{"name":"Sorbonne Universit\u00e9s, UPMC Univ Paris 06, Paris, France"}]},{"given":"Albert","family":"Cohen","sequence":"additional","affiliation":[{"name":"INRIA and \u00c9cole Normale Sup\u00e9rieure, Paris, France"}]},{"given":"Nathalie","family":"Drach","sequence":"additional","affiliation":[{"name":"Sorbonne Universit\u00e9s, UPMC Univ Paris 06, Paris, 
France"}]}],"member":"320","published-online":{"date-parts":[[2016,9,11]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/1468075.1468121"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.5555\/370049.370455"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/209936.209958"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/324133.324234"},{"key":"e_1_3_2_1_5_1","first-page":"1","volume-title":"2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS)","author":"Broquedis F.","unstructured":"F. Broquedis, O. Aumage, B. Goglin, S. Thibault, P.-A. Wacrenier, and R. Namyst. Structuring the execution of openmp applications for multicore architectures. In 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), pages 1--10. IEEE."},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10766-010-0136-3"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-30961-8_8"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1155\/2010\/521797"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2093157.2093165"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1094811.1094852"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/2597652.2597665"},{"key":"e_1_3_2_1_12_1","unstructured":"J. Corbet. NUMA in a hurry Nov. 
2012."},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/2451116.2451157"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/2641764"},{"key":"e_1_3_2_1_15_1","volume-title":"Advanced Micro Devices","author":"Drongowski P. J.","year":"2007","unstructured":"P. J. Drongowski. Instruction-Based Sampling: A New Performance Analysis Technique for AMD Family 10h Processors. Advanced Micro Devices, November 2007."},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/1278177.1278182"},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2009.5161101"},{"key":"e_1_3_2_1_18_1","unstructured":"Intel Corporation. Intel Math Kernel Library. https:\/\/software.intel.com\/en-us\/intel-mkl accessed 01\/2015."},{"key":"e_1_3_2_1_19_1","unstructured":"Intel Corporation. Threading Building Blocks. https:\/\/www.threadingbuildingblocks.org\/ accessed 09\/2015."},{"key":"e_1_3_2_1_20_1","first-page":"263","volume-title":"Proceedings of the 2015 Usenix Annual Technical Conference, USENIX ATC '15","author":"Kaestle S.","year":"2015","unstructured":"S. Kaestle, R. Achermann, T. Roscoe, and T. Harris. Shoal: Smart allocation and replication of memory for parallel programs. 
In Proceedings of the 2015 Usenix Annual Technical Conference, USENIX ATC '15, pages 263--276, Berkeley, CA, USA, 2015. USENIX Association."},{"key":"e_1_3_2_1_21_1","volume-title":"Apr.","author":"Kleen A.","year":"2005","unstructured":"A. Kleen. A NUMA API for Linux, Apr. 2005."},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/2687652"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/2442516.2442524"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.5555\/2813767.2813788"},{"key":"e_1_3_2_1_25_1","first-page":"387","volume-title":"Proceedings of the 19th Annual International Conference on Supercomputing, ICS '05","author":"H. L\u00f6","year":"2005","unstructured":"H. L\u00f6 PDE Solver on a cc-NUMA System. In Proceedings of the 19th Annual International Conference on Supercomputing, ICS '05, pages 387--392, New York, NY, USA, 2005. ACM."},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/2688500.2688509"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/563647.563657"},{"key":"e_1_3_2_1_28_1","volume-title":"July","author":"Architecture Review Board MP","year":"2013","unstructured":"OpenMP Architecture Review Board. OpenMP Application Program Interface Version 4.0, July 2013."},{"key":"e_1_3_2_1_29_1","unstructured":"
http:\/\/www.openstream.info."},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1177\/1094342009106195"},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/2400682.2400712"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/1988915.1988918"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/SBAC-PAD.2009.16"},{"key":"e_1_3_2_1_35_1","volume-title":"QUARK Users' Guide - QUeueing And Runtime for Kernels","author":"YarKhan A.","year":"2011","unstructured":"A. YarKhan, J. Kurzak, and J. Dongarra. QUARK Users' Guide - QUeueing And Runtime for Kernels, 2011. http:\/\/ash2.icl.utk.edu\/sites\/ash2.icl.utk.edu\/les\/publications\/2011\/icl-utk-454--2011.pdf, accessed 10\/2014."}],"event":{"name":"PACT '16: International Conference on Parallel Architectures and Compilation","sponsor":["IFIP WG 10.3 IFIP WG 10.3","IEEE TCCA IEEE Computer Society Technical Committee on Computer Architecture","SIGARCH ACM Special Interest Group on Computer Architecture","IEEE CS TCPP IEEE Computer Society Technical Committee on Parallel Processing"],"location":"Haifa Israel","acronym":"PACT '16"},"container-title":["Proceedings of the 2016 International Conference on Parallel Architectures and Compilation"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2967938.2967946","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,5]],"date-time":"2023-01-05T15:12:44Z","timestamp":1672931564000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2967938.2967946"}},"subtitle":["A Uniform Abstraction for Coordinated Scheduling and Memory 
Management"],"short-title":[],"issued":{"date-parts":[[2016,9,11]]},"references-count":34,"alternative-id":["10.1145\/2967938.2967946","10.1145\/2967938"],"URL":"https:\/\/doi.org\/10.1145\/2967938.2967946","relation":{},"subject":[],"published":{"date-parts":[[2016,9,11]]},"assertion":[{"value":"2016-09-11","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}