{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,9,3]],"date-time":"2023-09-03T05:41:36Z","timestamp":1693719696497},"reference-count":37,"publisher":"Wiley","issue":"4","license":[{"start":{"date-parts":[[2014,8,22]],"date-time":"2014-08-22T00:00:00Z","timestamp":1408665600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/onlinelibrary.wiley.com\/termsAndConditions#vor"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Concurrency and Computation"],"published-print":{"date-parts":[[2015,3,25]]},"abstract":"With the constantly increasing number of cores in multicore processors, more attention should be paid to the on\u2010chip interconnect. Performance and power consumption of an on\u2010chip interconnect are directly affected by the network topology. Researchers have proposed various topologies to optimize these metrics. The efficiency can also be optimized by proper mapping of applications. Therefore, in this paper we propose a novel partially diagonal network\u2010on\u2010chip (PDNOC) design that takes advantage of both heterogeneous network topology and congestion\u2010aware application mapping. We analyse the partially diagonal network in terms of interconnect structure, area usage, power consumption, routing algorithm and implementation complexity. The key insight that enables the PDNOC is that most communication patterns in real\u2010world applications are hot\u2010spot and bursty. We implement a full system simulation environment using SPLASH\u20102 benchmarks. Performance metrics of standard mesh, concentrated mesh, full diagonal mesh and four types of the proposed PDNOC are measured in terms of network latency, application execution time and energy delay product. 
Evaluation results show that on average, the proposed PDNOC designs provide up to 36% improvement in execution time over concentrated mesh, and 3.6\u00d7 better energy delay product over fully connected diagonal network. PDNOC design with two adjacent PD networks is a better candidate for higher efficiency, while four PD networks provide better performance. Copyright \u00a9 2014 John Wiley & Sons, Ltd.<\/jats:p>","DOI":"10.1002\/cpe.3364","type":"journal-article","created":{"date-parts":[[2014,8,22]],"date-time":"2014-08-22T18:50:36Z","timestamp":1408733436000},"page":"1054-1067","source":"Crossref","is-referenced-by-count":7,"title":["PDNOC: Partially diagonal network\u2010on\u2010chip for high efficiency multicore systems"],"prefix":"10.1002","volume":"27","author":[{"given":"Thomas Canhao","family":"Xu","sequence":"first","affiliation":[{"name":"Department of Information Technology University of Turku Turku 20014 Finland"}]},{"given":"Ville","family":"Lepp\u00e4nen","sequence":"additional","affiliation":[{"name":"Department of Information Technology University of Turku Turku 20014 Finland"}]},{"given":"Pasi","family":"Liljeberg","sequence":"additional","affiliation":[{"name":"Department of Information Technology University of Turku Turku 20014 Finland"}]},{"given":"Juha","family":"Plosila","sequence":"additional","affiliation":[{"name":"Department of Information Technology University of Turku Turku 20014 Finland"}]},{"given":"Hannu","family":"Tenhunen","sequence":"additional","affiliation":[{"name":"Department of Information Technology University of Turku Turku 20014 Finland"}]}],"member":"311","published-online":{"date-parts":[[2014,8,22]]},"reference":[{"key":"e_1_2_7_2_1","unstructured":"Intel.Press kit \u2013 moore's law 40th anniversary February2014. 
(Available from:Http:\/\/www.intel.com\/pressroom\/kits\/events\/moores_law_40th\/index.htm) [Accessed on June 2014]."},{"key":"e_1_2_7_3_1","doi-asserted-by":"crossref","unstructured":"KumarS JantschA SoininenJP ForsellM MillbergM ObergJ TiensyrjaK HemaniA.A network on chip architecture and design methodology.IEEE Computer Society Annual Symposium on VLSI 2002. Proceedings Pittsburgh PA 2002;105\u2013112. DOI:10.1109\/ISVLSI.2002.1016885.","DOI":"10.1109\/ISVLSI.2002.1016885"},{"key":"e_1_2_7_4_1","doi-asserted-by":"publisher","DOI":"10.1007\/b105353"},{"key":"e_1_2_7_5_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.sysarc.2013.05.002"},{"key":"e_1_2_7_6_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.micpro.2011.06.009"},{"key":"e_1_2_7_7_1","doi-asserted-by":"crossref","unstructured":"BellS EdwardsB AmannJ ConlinR JoyceK LeungV MacKayJ ReifM BaoL JBrown MattinaM MiaoC\u2010C RameyC WentzlaffD AndersonW BergerE FairbanksN DKhan MontenegroF StickneyJ ZookJ.Tile64 \u2010 processor: a 64\u2010core soc with mesh interconnect.Solid\u2010State Circuits Conference 2008. ISSCC 2008. Digest of Technical Papers. IEEE International San Francisco CA 2008;88\u2013598. DOI:10.1109\/ISSCC.2008.4523070.","DOI":"10.1109\/ISSCC.2008.4523070"},{"key":"e_1_2_7_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2007.4378784"},{"key":"e_1_2_7_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2007.4378783"},{"key":"e_1_2_7_10_1","doi-asserted-by":"crossref","unstructured":"TotoniE BehzadB GhikeS TorrellasJ.Comparing the power and performance of intel's scc to state\u2010of\u2010the\u2010art cpus and gpus.2012 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) New Brunswick NJ 2012;78\u201387. 
DOI:10.1109\/ISPASS.2012.6189208.","DOI":"10.1109\/ISPASS.2012.6189208"},{"key":"e_1_2_7_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/IISWC.2011.6114191"},{"key":"e_1_2_7_12_1","doi-asserted-by":"crossref","unstructured":"XuT GuangL YinA YangB LiljebergP TenhunenH.An analysis of designing 2d\/3d chip multiprocessor with different cache architecture.NORCHIP 2010 Tampere Finland November2010;1\u20136. DOI:10.1109\/NORCHIP.2010.5669433.","DOI":"10.1109\/NORCHIP.2010.5669433"},{"key":"e_1_2_7_13_1","doi-asserted-by":"crossref","unstructured":"PandeP GrecuC IvanovA SalehR.Design of a switch for network on chip applications.Proceedings of the 2003 International Symposium on Circuits and Systems 2003. ISCAS '03 Vol5 Bangkok Thailand May2003;V\u2013217\u2013V\u2013220 vol.5. DOI: 10.1109\/ISCAS.2003.1206235.","DOI":"10.1109\/ISCAS.2003.1206235"},{"key":"e_1_2_7_14_1","doi-asserted-by":"crossref","unstructured":"KimJ DallyW ScottS AbtsD.Technology\u2010driven highly\u2010scalable dragonfly topology.35th International Symposium on Computer Architecture 2008. ISCA '08 Beijing China 2008;77\u201388. DOI:10.1109\/ISCA.2008.19.","DOI":"10.1109\/ISCA.2008.19"},{"key":"e_1_2_7_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/71.97897"},{"key":"e_1_2_7_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2007.4378780"},{"key":"e_1_2_7_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2005.134"},{"key":"e_1_2_7_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/1183401.1183430"},{"key":"e_1_2_7_19_1","doi-asserted-by":"crossref","unstructured":"DasR EachempatiS MishraA NarayananV DasC.Design and evaluation of a hierarchical on\u2010chip interconnect for next\u2010generation cmps.IEEE 15th International Symposium on High Performance Computer Architecture 2009. HPCA 2009 Raleigh North Carolina USA February2009;175\u2013186. 
DOI:10.1109\/HPCA.2009.4798252.","DOI":"10.1109\/HPCA.2009.4798252"},{"key":"e_1_2_7_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/L-CA.2007.10"},{"key":"e_1_2_7_21_1","doi-asserted-by":"crossref","unstructured":"WangC BagherzadehN.Design and evaluation of a high throughput qos\u2010aware and congestion\u2010aware router architecture for network\u2010on\u2010chip.2012 20th Euromicro International Conference on Parallel Distributed and Network\u2010Based Processing (PDP) Garching Germany 2012;457\u2013464. DOI:10.1109\/PDP.2012.20.","DOI":"10.1109\/PDP.2012.20"},{"key":"e_1_2_7_22_1","doi-asserted-by":"crossref","unstructured":"WangC HuWH BagherzadehN.Congestion\u2010aware network\u2010on\u2010chip router architecture.2010 15th CSI International Symposium on Computer Architecture and Digital Systems (CADS) Tehran Iran 2010;137\u2013144. DOI:10.1109\/CADS.2010.5623552.","DOI":"10.1109\/CADS.2010.5623552"},{"key":"e_1_2_7_23_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0167-8191(00)00063-6"},{"key":"e_1_2_7_24_1","doi-asserted-by":"crossref","unstructured":"TrobecR.Evaluation of d\u2010mesh interconnect for soc.International Conference on Parallel Processing Workshops 2009. ICPPW '09 Vienna Austria September2009;507\u2013512. DOI:10.1109\/ICPPW.2009.74.","DOI":"10.1109\/ICPPW.2009.74"},{"key":"e_1_2_7_25_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-55224-3_48"},{"key":"e_1_2_7_26_1","doi-asserted-by":"crossref","unstructured":"XuTC PahikkalaT AirolaA LiljebergP PlosilaJ SalakoskiT TenhunenH.Implementation and analysis of block dense matrix decomposition on network\u2010on\u2010chips.2012 IEEE 14th International Conference on High Performance Computing and Communication 2012 IEEE 9th International Conference on Embedded Software and Systems (HPCC\u2010ICESS) Liverpool UK 2012;516\u2013523. 
DOI:10.1109\/HPCC.2012.76.","DOI":"10.1109\/HPCC.2012.76"},{"key":"e_1_2_7_27_1","doi-asserted-by":"crossref","unstructured":"XuTC LiljebergP PlosilaJ TenhunenH.Evaluate and optimize parallel barnes\u2010hut algorithm for emerging many\u2010core architectures.2013 International Conference on High Performance Computing and Simulation (HPCS) Helsinki Finland July2013;421\u2013428. DOI:10.1109\/HPCSim.2013.6641449.","DOI":"10.1109\/HPCSim.2013.6641449"},{"key":"e_1_2_7_28_1","doi-asserted-by":"crossref","unstructured":"WooS OharaM TorrieE SinghJ GuptaA.The splash\u20102 programs: characterization and methodological considerations.22nd Annual International Symposium on Computer Architecture 1995. Proceedings Santa Margherita Ligure Italy 1995;24\u201336.","DOI":"10.1145\/223982.223990"},{"key":"e_1_2_7_29_1","doi-asserted-by":"crossref","unstructured":"KimJ.Low\u2010cost router microarchitecture for on\u2010chip networks.42nd Annual IEEE\/ACM International Symposium on Microarchitecture 2009. 
MICRO\u201042 New York NY USA 2009;255\u2013266.","DOI":"10.1145\/1669112.1669145"},{"key":"e_1_2_7_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/DATE.2009.5090700"},{"key":"e_1_2_7_31_1","doi-asserted-by":"crossref","unstructured":"DallyWJ TowlesB.Route packets not wires: on\u2010chip interconnection networks.Proceedings of the 38th Conference on Design Automation Las Vegas Nevada USA 2001;684\u2013689.","DOI":"10.1145\/378239.379048"},{"key":"e_1_2_7_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2039370.2039405"},{"key":"e_1_2_7_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/2.982916"},{"key":"e_1_2_7_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/1105734.1105747"},{"key":"e_1_2_7_35_1","doi-asserted-by":"crossref","unstructured":"LauterbachG GreenleyD AhmedS BoffeyM ChamdaniJ ChangSE ChenD FangY KHoldbrook HsiehM KeishB MelansonR NarasimhaiahC PetolinoJ PhamT QuachL KTam TongD YangL YauK.Ultrasparc\u2010iii: a 3rd generation 64 b sparc microprocessor.Solid\u2010State Circuits Conference 2000. Digest of Technical Papers. ISSCC. 2000 IEEE International San Francisco CA USA February2000;410\u2013411. 
DOI:10.1109\/ISSCC.2000.839837.","DOI":"10.1109\/ISSCC.2000.839837"},{"key":"e_1_2_7_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/605397.605420"},{"key":"e_1_2_7_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/2401716.2401725"},{"key":"e_1_2_7_38_1","doi-asserted-by":"crossref","unstructured":"PatelA GhoseK.Energy\u2010efficient mesi cache coherence with pro\u2010active snoop filtering for multicore microprocessors.Proceeding of the Thirteenth International Symposium on Low Power Electronics and Design Bangalore India 2008;247\u2013252.","DOI":"10.1145\/1393921.1393988"}],"container-title":["Concurrency and Computation: Practice and Experience"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/api.wiley.com\/onlinelibrary\/tdm\/v1\/articles\/10.1002%2Fcpe.3364","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/cpe.3364","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,2]],"date-time":"2023-09-02T21:50:45Z","timestamp":1693691445000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1002\/cpe.3364"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,8,22]]},"references-count":37,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2015,3,25]]}},"alternative-id":["10.1002\/cpe.3364"],"URL":"https:\/\/doi.org\/10.1002\/cpe.3364","archive":["Portico"],"relation":{},"ISSN":["1532-0626","1532-0634"],"issn-type":[{"value":"1532-0626","type":"print"},{"value":"1532-0634","type":"electronic"}],"subject":[],"published":{"date-parts":[[2014,8,22]]}}}