{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,8,24]],"date-time":"2024-08-24T12:59:47Z","timestamp":1724504387317},"reference-count":60,"publisher":"Springer Science and Business Media LLC","issue":"16","license":[{"start":{"date-parts":[[2023,5,11]],"date-time":"2023-05-11T00:00:00Z","timestamp":1683763200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,5,11]],"date-time":"2023-05-11T00:00:00Z","timestamp":1683763200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"Copenhagen Business School Library"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Supercomput"],"published-print":{"date-parts":[[2023,11]]},"abstract":"Applications running in a large and complex manycore system can significantly benefit from adopting the dataflow model of computation. In a dataflow execution environment, a thread can run only if all its required inputs are available. While the potential benefits are large, it is not trivial to improve resource utilization and energy efficiency by focusing on dataflow thread execution models (i.e., the ways specifying how the threads adhering to a dataflow model of computation execute on a given compute\/communication architecture). This paper proposes and implements a hardware-software co-design-based dataflow threads management framework. It works at the Network-on-Chip (NoC) level and consists of three stages. The first stage focuses on a fast and effective thread distribution policy. The next stage proposes an approach that adds reconfigurability to a 2D mesh NoC via customized instructions to manage the dataflow thread distribution. Finally, a 2D mesh and ring-based hybrid NoC is proposed for better scalability and higher performance. 
This work can be considered a primary reference framework from which extensions can be carried out.","DOI":"10.1007\/s11227-023-05335-8","type":"journal-article","created":{"date-parts":[[2023,5,11]],"date-time":"2023-05-11T14:02:33Z","timestamp":1683813753000},"page":"17983-18020","update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["NoC-based hardware software co-design framework for dataflow thread management"],"prefix":"10.1007","volume":"79","author":[{"given":"Somnath","family":"Mazumdar","sequence":"first","affiliation":[]},{"given":"Alberto","family":"Scionti","sequence":"additional","affiliation":[]},{"given":"St\u00e9phane","family":"Zuckerman","sequence":"additional","affiliation":[]},{"given":"Antoni","family":"Portero","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,5,11]]},"reference":[{"key":"5335_CR1","doi-asserted-by":"crossref","unstructured":"Shin W, Oles V, Karimi AM, Ellis JA, Wang F (2021) Revealing power, energy and thermal dynamics of a 200pf pre-exascale supercomputer. In: Proceedings of the international conference for high performance computing, networking, storage and analysis. Association for computing machinery. New York","DOI":"10.1145\/3458817.3476188"},{"issue":"1","key":"5335_CR2","doi-asserted-by":"publisher","first-page":"34","DOI":"10.1109\/MSPEC.2022.9676353","volume":"59","author":"D Schneider","year":"2022","unstructured":"Schneider D (2022) The Exascale Era is upon us: the frontier supercomputer may be the first to reach 1,000,000,000,000,000,000 operations per second. IEEE Spectr 59(1):34\u201335. 
https:\/\/doi.org\/10.1109\/MSPEC.2022.9676353","journal-title":"IEEE Spectr"},{"key":"5335_CR3","doi-asserted-by":"publisher","unstructured":"Sato M, Ishikawa Y, Tomita H, Kodama Y, Odajima T, Tsuji M, Yashiro H, Aoki M, Shida N, Miyoshi I, Hirai K, Furuya A, Asato A, Morita K, Shimizu T (2020) Co-design for a64fx manycore processor and \u201cfugaku\u201d. In: SC20: International Conference for High Performance Computing, Networking, Storage and Analysis, pp 1\u201315. https:\/\/doi.org\/10.1109\/SC41405.2020.00051","DOI":"10.1109\/SC41405.2020.00051"},{"key":"5335_CR4","unstructured":"Jia Z, Tillman B, Maggioni M, Scarpazza DP (2019) Dissecting the graphcore IPU architecture via microbenchmarking. arXiv preprint arXiv:1912.03413"},{"key":"5335_CR5","unstructured":"Louw T, McIntosh-Smith S (2021) Using the graphcore IPU for traditional HPC applications. In: 3rd Workshop on Accelerated Machine Learning (AccML)"},{"issue":"2","key":"5335_CR6","doi-asserted-by":"publisher","first-page":"50","DOI":"10.1109\/MM.2021.3061912","volume":"41","author":"J Vasiljevic","year":"2021","unstructured":"Vasiljevic J, Bajic L, Capalija D, Sokorac S, Ignjatovic D, Bajic L, Trajkovic M, Hamer I, Matosevic I, Cejkov A et al (2021) Compute substrate for software 2.0. IEEE Micro 41(2):50\u201355","journal-title":"IEEE Micro"},{"issue":"5","key":"5335_CR7","doi-asserted-by":"publisher","first-page":"33","DOI":"10.1109\/MC.2006.180","volume":"39","author":"EA Lee","year":"2006","unstructured":"Lee EA (2006) The problem with threads. Computer 39(5):33\u201342","journal-title":"Computer"},{"issue":"9","key":"5335_CR8","doi-asserted-by":"publisher","first-page":"1002","DOI":"10.14778\/3329772.3329777","volume":"12","author":"M Hoffmann","year":"2019","unstructured":"Hoffmann M, Lattuada A, McSherry F, Kalavri V, Liagouris J, Roscoe T (2019) Megaphone: latency-conscious state migration for distributed streaming dataflows. 
Proc VLDB Endow 12(9):1002\u20131015","journal-title":"Proc VLDB Endow"},{"key":"5335_CR9","doi-asserted-by":"crossref","unstructured":"Nowatzki T, Gangadhar V, Sankaralingam K (2015) Exploring the potential of heterogeneous von neumann\/dataflow execution models. In: Proceedings of the 42nd Annual International Symposium on Computer Architecture. ACM, pp 298\u2013310","DOI":"10.1145\/2749469.2750380"},{"key":"5335_CR10","doi-asserted-by":"crossref","unstructured":"Gostelow KP, Plouffe W, et al (1977) Indeterminacy, monitors, and dataflow. In: ACM SIGOPS Operating Systems Review. vol 11. ACM, pp 159\u2013169","DOI":"10.1145\/1067625.806559"},{"key":"5335_CR11","doi-asserted-by":"crossref","unstructured":"Barrow-Williams N, Fensch C, Moore S (2009) A communication characterisation of splash-2 and parsec. In: Workload Characterization, 2009. IISWC 2009. IEEE International Symposium on. IEEE, pp 86\u201397","DOI":"10.1109\/IISWC.2009.5306792"},{"issue":"5","key":"5335_CR12","doi-asserted-by":"publisher","first-page":"51","DOI":"10.1109\/MM.2007.4378783","volume":"27","author":"Y Hoskote","year":"2007","unstructured":"Hoskote Y, Vangal S, Singh A, Borkar N, Borkar S (2007) A 5-GHz mesh interconnect for a teraflops processor. IEEE Micro 27(5):51\u201361","journal-title":"IEEE Micro"},{"key":"5335_CR13","doi-asserted-by":"crossref","unstructured":"Dally WJ, Towles B (2001) Route packets, not wires: on-chip interconnection networks. In: Design Automation Conference, 2001. Proceedings. IEEE, pp 684\u2013689","DOI":"10.1145\/378239.379048"},{"issue":"1","key":"5335_CR14","doi-asserted-by":"publisher","first-page":"29","DOI":"10.1109\/JSSC.2007.910957","volume":"43","author":"SR Vangal","year":"2008","unstructured":"Vangal SR, Howard J, Ruhl G, Dighe S, Wilson H, Tschanz J, Finan D, Singh A, Jacob T, Jain S et al (2008) An 80-tile sub-100-w teraflops processor in 65-nm CMOS. 
IEEE J Solid State Circuits 43(1):29\u201341","journal-title":"IEEE J Solid State Circuits"},{"key":"5335_CR15","doi-asserted-by":"crossref","unstructured":"Das R, Eachempati S, Mishra AK, Narayanan V, Das CR (2009) Design and evaluation of a hierarchical on-chip interconnect for next-generation CMPS. In: 2009 IEEE 15th International Symposium on High Performance Computer Architecture. IEEE, pp 175\u2013186","DOI":"10.1109\/HPCA.2009.4798252"},{"key":"5335_CR16","doi-asserted-by":"publisher","first-page":"29","DOI":"10.1016\/j.parco.2016.01.009","volume":"54","author":"R Ausavarungnirun","year":"2016","unstructured":"Ausavarungnirun R, Fallin C, Yu X, Chang KK-W, Nazario G, Das R, Loh GH, Mutlu O (2016) A case for hierarchical rings with deflection routing: an energy-efficient on-chip communication substrate. Parallel Comput 54:29\u201345","journal-title":"Parallel Comput"},{"issue":"1","key":"5335_CR17","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1109\/LCA.2017.2697863","volume":"17","author":"A Scionti","year":"2018","unstructured":"Scionti A, Mazumdar S, Zuckerman S (2018) Enabling massive multi-threading with fast hashing. IEEE Comput Archit Lett 17(1):1\u20134. https:\/\/doi.org\/10.1109\/LCA.2017.2697863","journal-title":"IEEE Comput Archit Lett"},{"key":"5335_CR18","doi-asserted-by":"crossref","unstructured":"Scionti A, Mazumdar S, Portero A (2016) Software defined network-on-chip for scalable cmps. In: 2016 International Conference on High Performance Computing Simulation (HPCS). IEEE, pp 112\u2013115","DOI":"10.1109\/HPCSim.2016.7568323"},{"issue":"9","key":"5335_CR19","doi-asserted-by":"publisher","first-page":"6720","DOI":"10.1007\/s11227-019-03072-5","volume":"76","author":"S Mazumdar","year":"2020","unstructured":"Mazumdar S, Scionti A (2020) Ring-mesh: a scalable and high-performance approach for manycore accelerators. 
J Supercomput 76(9):6720\u20136752","journal-title":"J Supercomput"},{"key":"5335_CR20","doi-asserted-by":"crossref","unstructured":"Dennis JB, Misunas DP (1975) A preliminary architecture for a basic data-flow processor. In: ACM SIGARCH Computer Architecture News, vol 3. ACM, pp 126\u2013132","DOI":"10.1145\/641675.642111"},{"key":"5335_CR21","doi-asserted-by":"publisher","unstructured":"Papadopoulos GM, Culler DE (1990) Monsoon: an explicit token-store architecture. In: Proceedings of the 17th Annual International Symposium on Computer Architecture. ISCA \u201990. Association for Computing Machinery, New York, pp 82\u201391. https:\/\/doi.org\/10.1145\/325164.325117","DOI":"10.1145\/325164.325117"},{"key":"5335_CR22","doi-asserted-by":"publisher","first-page":"362","DOI":"10.1007\/3-540-06859-7_145","volume-title":"Programming symposium","author":"JB Dennis","year":"1974","unstructured":"Dennis JB (1974) First version of a data flow procedure language. In: Robinet B (ed) Programming symposium. Springer, Berlin, Heidelberg, pp 362\u2013376"},{"key":"5335_CR23","doi-asserted-by":"publisher","first-page":"598","DOI":"10.1145\/69558.69562","volume":"11","author":"RS Arvind Nikhil","year":"1989","unstructured":"Arvind Nikhil RS, Pingali KK (1989) I-structures: data structures for parallel computing. ACM Trans Program Lang Syst 11:598\u2013632. https:\/\/doi.org\/10.1145\/69558.69562","journal-title":"ACM Trans Program Lang Syst"},{"issue":"9","key":"5335_CR24","doi-asserted-by":"publisher","first-page":"1305","DOI":"10.1109\/5.97300","volume":"79","author":"N Halbwachs","year":"1991","unstructured":"Halbwachs N, Caspi P, Raymond P, Pilaud D (1991) The synchronous data flow programming language LUSTRE. Proc IEEE 79(9):1305\u20131320. 
https:\/\/doi.org\/10.1109\/5.97300","journal-title":"Proc IEEE"},{"issue":"2","key":"5335_CR25","doi-asserted-by":"publisher","first-page":"151","DOI":"10.1023\/A:1008052406396","volume":"21","author":"SS Bhattacharyya","year":"1999","unstructured":"Bhattacharyya SS, Murthy PK, Lee EA (1999) Synthesis of embedded software from synchronous dataflow specifications. J VLSI Signal Process 21(2):151\u2013166. https:\/\/doi.org\/10.1023\/A:1008052406396","journal-title":"J VLSI Signal Process"},{"key":"5335_CR26","doi-asserted-by":"publisher","first-page":"292","DOI":"10.1007\/s10766-009-0101-1","volume":"37","author":"A Duran","year":"2009","unstructured":"Duran A, Ferrer R, Ayguad\u00e9 E, Badia RM, Labarta J (2009) A proposal to extend the OpenMP tasking model with dependent tasks. Int J Parallel Program 37:292\u2013305. https:\/\/doi.org\/10.1007\/s10766-009-0101-1","journal-title":"Int J Parallel Program"},{"key":"5335_CR27","doi-asserted-by":"publisher","unstructured":"Nemawarkar SS, Gao GR (1996) Measurement and modeling of earth-manna multithreaded architecture. In: Proceedings of MASCOTS \u201996 - 4th International Workshop on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, pp 109\u2013114. https:\/\/doi.org\/10.1109\/MASCOT.1996.501002","DOI":"10.1109\/MASCOT.1996.501002"},{"key":"5335_CR28","unstructured":"Theobald KB (1999) Earth: an efficient architecture: for running threads. PhD thesis, McGill University, Montr\u00e9al Qu\u00e9bec"},{"key":"5335_CR29","doi-asserted-by":"crossref","unstructured":"Vishkin U, Dascal S, Berkovich E, Nuzman J (1998) Explicit multi-threading (XMT) bridging models for instruction parallelism. In: Proceedings of the Tenth Annual ACM Symposium on Parallel Algorithms and Architectures. 
ACM, pp 140\u2013151","DOI":"10.1145\/277651.277680"},{"key":"5335_CR30","doi-asserted-by":"publisher","unstructured":"Pell O, Mencer O, Tsoi KH, Luk W (2013) In: Vanderbauwhede W, Benkrid K (eds) Maximum performance computing with dataflow engines. Springer, New York, pp 747\u2013774. https:\/\/doi.org\/10.1007\/978-1-4614-1791-0_25","DOI":"10.1007\/978-1-4614-1791-0_25"},{"issue":"6","key":"5335_CR31","doi-asserted-by":"publisher","first-page":"1489","DOI":"10.1109\/TPDS.2013.125","volume":"25","author":"F Yazdanpanah","year":"2014","unstructured":"Yazdanpanah F, Alvarez-Martinez C, Jimenez-Gonzalez D, Etsion Y (2014) Hybrid dataflow\/von-Neumann architectures. Parallel Distrib Syst IEEE Trans 25(6):1489\u20131509","journal-title":"Parallel Distrib Syst IEEE Trans"},{"key":"5335_CR32","doi-asserted-by":"crossref","unstructured":"Zuckerman S, Suetterlein J, Knauerhase R, Gao GR (2011) Using a codelet program execution model for exascale machines: position paper. In: Proceedings of the 1st International Workshop on Adaptive Self-Tuning Computing Systems for the Exaflop Era. ACM, pp 64\u201369","DOI":"10.1145\/2000417.2000424"},{"key":"5335_CR33","doi-asserted-by":"crossref","unstructured":"Suetterlein J, Zuckerman S, Gao GR (2013) An implementation of the codelet model. In: Wolf F, Mohr B, an Mey D (eds) Euro-Par 2013 Parallel Processing. Springer, Berlin, pp 633\u2013644","DOI":"10.1007\/978-3-642-40047-6_63"},{"issue":"1","key":"5335_CR34","doi-asserted-by":"publisher","first-page":"19","DOI":"10.1016\/j.vlsi.2004.03.006","volume":"38","author":"E Bolotin","year":"2004","unstructured":"Bolotin E, Cidon I, Ginosar R, Kolodny A (2004) Cost considerations in network on chip. Integr VLSI J 38(1):19\u201342","journal-title":"Integr VLSI J"},{"key":"5335_CR35","doi-asserted-by":"crossref","unstructured":"Parikh R, Das R, Bertacco V (2014) Power-aware NoCS through routing and topology reconfiguration. In: 2014 51st ACM\/EDAC\/IEEE Design Automation Conference (DAC). 
IEEE, pp 1\u20136","DOI":"10.1109\/DAC.2014.6881489"},{"key":"5335_CR36","doi-asserted-by":"crossref","unstructured":"Murali S, De\u00a0Micheli G (2004) Sunmap: a tool for automatic topology selection and generation for NoCS. In: Proceedings of the 41st Annual Design Automation Conference. ACM, pp 914\u2013919","DOI":"10.1145\/996566.996809"},{"key":"5335_CR37","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2022.3227460","author":"R Singh","year":"2022","unstructured":"Singh R, Bohra MK, Hemrajani P, Kalla A, Bhatt DP, Purohit N, Daneshtalab M (2022) Review, analysis, and implementation of path selection strategies for 2D NoCS. IEEE Access. https:\/\/doi.org\/10.1109\/ACCESS.2022.3227460","journal-title":"IEEE Access"},{"key":"5335_CR38","doi-asserted-by":"crossref","unstructured":"Ravindran G, Stumm M (1997) A performance comparison of hierarchical ring-and mesh-connected multiprocessor networks. In: High-Performance Computer Architecture, 1997, Third International Symposium on. IEEE, pp 58\u201369","DOI":"10.1109\/HPCA.1997.569606"},{"issue":"1","key":"5335_CR39","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1109\/12.902749","volume":"50","author":"VC Hamacher","year":"2001","unstructured":"Hamacher VC, Jiang H (2001) Hierarchical ring network configuration and performance modeling. IEEE Trans Comput 50(1):1\u201312","journal-title":"IEEE Trans Comput"},{"key":"5335_CR40","doi-asserted-by":"crossref","unstructured":"Kim J, Kim H (2009) Router microarchitecture and scalability of ring topology in on-chip networks. In: Proceedings of the 2nd International Workshop on Network on Chip Architectures. ACM, pp 5\u201310","DOI":"10.1145\/1645213.1645217"},{"key":"5335_CR41","doi-asserted-by":"publisher","first-page":"118","DOI":"10.1016\/j.jpdc.2018.09.009","volume":"123","author":"D Deb","year":"2019","unstructured":"Deb D, Jose J, Das S, Kapoor HK (2019) Cost effective routing techniques in 2D mesh NoC using on-chip transmission lines. 
J Parallel and Distrib Comput 123:118\u2013129","journal-title":"J Parallel and Distrib Comput"},{"key":"5335_CR42","doi-asserted-by":"publisher","DOI":"10.1016\/j.micpro.2022.104551","volume":"92","author":"M Manzoor","year":"2022","unstructured":"Manzoor M, Mir RN et al (2022) PAAD (partially adaptive and deterministic routing): a deadlock free congestion aware hybrid routing for 2D mesh network-on-chips. Microprocess Microsyst 92:104551","journal-title":"Microprocess Microsyst"},{"key":"5335_CR43","doi-asserted-by":"publisher","DOI":"10.1002\/dac.5360","author":"S Vazifedunn","year":"2023","unstructured":"Vazifedunn S, Reza A, Reshadi M (2023) Low-cost regional-based congestion-aware routing algorithm for 2D mesh NoC. Int J Commun Syst. https:\/\/doi.org\/10.1002\/dac.5360","journal-title":"Int J Commun Syst"},{"key":"5335_CR44","doi-asserted-by":"publisher","DOI":"10.1016\/j.compeleceng.2022.108404","author":"BNK Reddy","year":"2022","unstructured":"Reddy BNK, Kar S (2022) Performance evaluation of modified mesh-based NoC architecture. Comput Electr Eng. https:\/\/doi.org\/10.1016\/j.compeleceng.2022.108404","journal-title":"Comput Electr Eng"},{"key":"5335_CR45","doi-asserted-by":"publisher","unstructured":"Zhao J, Agrawal A, Nikolic B, Asanovi\u0107 K (2022) Constellation: an open-source SoC-capable NoC generator. In: 15th IEEE\/ACM International Workshop on Network on Chip Architectures (NoCArc), pp 1\u20137. https:\/\/doi.org\/10.1109\/NoCArc57472.2022.9911299","DOI":"10.1109\/NoCArc57472.2022.9911299"},{"issue":"4","key":"5335_CR46","doi-asserted-by":"publisher","first-page":"313","DOI":"10.1016\/j.micpro.2015.03.008","volume":"39","author":"N Zheng","year":"2015","unstructured":"Zheng N, Gu H, Huang X, Chen X (2015) Csquare: a new kilo-core-oriented topology. 
Microprocess Microsyst 39(4):313\u2013320","journal-title":"Microprocess Microsyst"},{"key":"5335_CR47","doi-asserted-by":"crossref","unstructured":"Kim H, Kim G, Maeng S, Yeo H, Kim J (2014) Transportation-network-inspired network-on-chip. In: 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA), pp. 332\u2013343. IEEE","DOI":"10.1109\/HPCA.2014.6835943"},{"key":"5335_CR48","doi-asserted-by":"crossref","unstructured":"Koohi S, Abdollahi M, Hessabi S (2011) All-optical wavelength-routed noc based on a novel hierarchical topology. In: Proceedings of the Fifth ACM\/IEEE International Symposium on Networks-on-Chip, pp. 97\u2013104. ACM","DOI":"10.1145\/1999946.1999962"},{"key":"5335_CR49","doi-asserted-by":"crossref","unstructured":"Grot B, Hestness J, Keckler SW, Mutlu O (2011) Kilo-noc: a heterogeneous network-on-chip architecture for scalability and service guarantees. In: ACM SIGARCH Computer Architecture News. ACM, vol 39, pp 401\u2013412","DOI":"10.1145\/2024723.2000112"},{"key":"5335_CR50","doi-asserted-by":"crossref","unstructured":"Bourduas S, Zilic Z (2007) A hybrid ring\/mesh interconnect for network-on-chip using hierarchical rings for global routing. In: First International Symposium on Networks-on-Chip (NOCS\u201907). IEEE, pp 195\u2013204","DOI":"10.1109\/NOCS.2007.3"},{"key":"5335_CR51","doi-asserted-by":"crossref","unstructured":"Sandoval-Arechiga R, Parra-Michel R, Vazquez-Avila J, Flores-Troncoso J, Ibarra-Delgado S (2016) Software defined networks-on-chip for multi\/many-core systems: A performance evaluation. In: Proceedings of the 2016 Symposium on Architectures for Networking and Communications Systems. ACM, pp 129\u2013130","DOI":"10.1145\/2881025.2889474"},{"issue":"4","key":"5335_CR52","first-page":"56","volume":"10","author":"J Lee","year":"2013","unstructured":"Lee J, Nicopoulos C, Lee HG, Kim J (2013) Tornadonoc: a lightweight and scalable on-chip network architecture for the many-core era. 
ACM Trans Architect Code Optim (TACO) 10(4):56","journal-title":"ACM Trans Architect Code Optim (TACO)"},{"key":"5335_CR53","doi-asserted-by":"crossref","unstructured":"Chen X, Peh L-S (2003) Leakage power modeling and optimization in interconnection networks. In: Proceedings of the 2003 International Symposium on Low Power Electronics and Design. ACM, pp 90\u201395","DOI":"10.1145\/871506.871531"},{"key":"5335_CR54","unstructured":"Wang H, Peh L-S, Malik S (2003) Power-driven design of router microarchitectures in on-chip networks. In: Proceedings of the 36th Annual IEEE\/ACM International Symposium on Microarchitecture. IEEE Computer Society, p 105"},{"key":"5335_CR55","doi-asserted-by":"crossref","unstructured":"Ma S, Jerger NE, Wang Z (2012) Whole packet forwarding: Efficient design of fully adaptive routing algorithms for networks-on-chip. In: IEEE International Symposium on High-Performance Comp Architecture. IEEE, pp 1\u201312","DOI":"10.1109\/HPCA.2012.6169049"},{"key":"5335_CR56","doi-asserted-by":"crossref","unstructured":"Lee J, Nicopoulos C, Park SJ, Swaminathan M, Kim J (2013) Do we need wide flits in networks-on-chip?. In: 2013 IEEE Computer Society Annual Symposium on VLSI (ISVLSI). IEEE, pp 2\u20137","DOI":"10.1109\/ISVLSI.2013.6654614"},{"issue":"2","key":"5335_CR57","doi-asserted-by":"publisher","first-page":"41","DOI":"10.1109\/LES.2015.2402197","volume":"7","author":"AB Kahng","year":"2015","unstructured":"Kahng AB, Lin B, Nath S (2015) Orion3.0: a comprehensive NoC router estimation tool. IEEE Embed Syst Lett 7(2):41\u201345","journal-title":"IEEE Embed Syst Lett"},{"key":"5335_CR58","doi-asserted-by":"crossref","unstructured":"Sun C, Chen C-HO, Kurian G, Wei L, Miller J, Agarwal A, Peh L-S, Stojanovic V (2012) Dsent-a tool connecting emerging photonics with electronics for opto-electronic networks-on-chip modeling. In: Networks on Chip (NoCS), 2012 Sixth IEEE\/ACM International Symposium on. 
IEEE, pp 201\u2013210","DOI":"10.1109\/NOCS.2012.31"},{"key":"5335_CR59","volume-title":"Principles and practices of interconnection networks","author":"WJ Dally","year":"2004","unstructured":"Dally WJ, Towles BP (2004) Principles and practices of interconnection networks. Morgan Kaufmann, San Francisco, USA"},{"key":"5335_CR60","doi-asserted-by":"crossref","unstructured":"Papamichael MK, Hoe JC (2012) CONNECT: re-examining conventional wisdom for designing NoCS in the context of FPGAs. In: Proceedings of the ACM\/SIGDA International Symposium on Field Programmable Gate Arrays. ACM, pp 37\u201346","DOI":"10.1145\/2145694.2145703"}],"container-title":["The Journal of Supercomputing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11227-023-05335-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11227-023-05335-8\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11227-023-05335-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,10,23]],"date-time":"2023-10-23T13:04:14Z","timestamp":1698066254000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11227-023-05335-8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,5,11]]},"references-count":60,"journal-issue":{"issue":"16","published-print":{"date-parts":[[2023,11]]}},"alternative-id":["5335"],"URL":"https:\/\/doi.org\/10.1007\/s11227-023-05335-8","relation":{},"ISSN":["0920-8542","1573-0484"],"issn-type":[{"value":"0920-8542","type":"print"},{"value":"1573-0484","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,5,11]]},"assertion":[{"value":"22 April 
2023","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"11 May 2023","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"11 October 2023","order":3,"name":"change_date","label":"Change Date","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Update","order":4,"name":"change_type","label":"Change Type","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"A sentence has been modified to \"The last dataflow-inspired architectures include explicit multi-threading (XMT) architecture, which introduces an abstract execution model, where switching from serial to parallel execution is made through explicit spawn\/join instructions [29].\"","order":5,"name":"change_details","label":"Change Details","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no known conflicting financial interests or personal relationships that could have appeared to influence the work reported in this paper.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethical approval"}}]}}