{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2022,4,4]],"date-time":"2022-04-04T07:33:21Z","timestamp":1649057601812},"reference-count":25,"publisher":"Springer Science and Business Media LLC","issue":"6","license":[{"start":{"date-parts":[[2014,2,12]],"date-time":"2014-02-12T00:00:00Z","timestamp":1392163200000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Computing"],"published-print":{"date-parts":[[2014,6]]},"DOI":"10.1007\/s00607-014-0387-8","type":"journal-article","created":{"date-parts":[[2014,2,11]],"date-time":"2014-02-11T10:55:38Z","timestamp":1392116138000},"page":"545-564","source":"Crossref","is-referenced-by-count":7,"title":["Potential thread-level-parallelism exploration with superblock reordering"],"prefix":"10.1007","volume":"96","author":[{"given":"John","family":"Ye","sequence":"first","affiliation":[]},{"given":"Hui","family":"Yan","sequence":"additional","affiliation":[]},{"given":"Honglun","family":"Hou","sequence":"additional","affiliation":[]},{"given":"Tianzhou","family":"Chen","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2014,2,12]]},"reference":[{"key":"387_CR1","series-title":"Lecture notes in computer science","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1007\/978-3-642-21487-5_4","volume-title":"OpenMP in the Petascale Era","author":"V Basupalli","year":"2011","unstructured":"Basupalli V, Yuki T, Rajopadhye S, Morvan A, Derrien S, Quinton P, Wonnacott D (2011) ompVerify: polyhedral analysis for the OpenMP programmer. 
In: Chapman B, Gropp W, Kumaran K, M\u00fcller M (eds) OpenMP in the Petascale Era, vol 6665. Lecture notes in computer science. Springer, Berlin, pp 37\u201353"},{"key":"387_CR2","doi-asserted-by":"crossref","unstructured":"Blake G, Dreslinski RG, Mudge T, Flautner K (2010) Evolution of thread-level parallelism in desktop applications. In: ISCA \u201910: Proceedings of the 37th annual international symposium on Computer architecture, ACM, New York, pp 302\u2013313","DOI":"10.1145\/1815961.1816000"},{"key":"387_CR3","doi-asserted-by":"crossref","unstructured":"Hammacher C, Streit K, Hack S, Zeller A (2009) Profiling java programs for parallelism. In: Proceedings of the 2009 ICSE Workshop on Multicore Software Engineering, IWMSE \u201909. IEEE Computer Society, Washington DC, pp 49\u201355","DOI":"10.1109\/IWMSE.2009.5071383"},{"issue":"2","key":"387_CR4","doi-asserted-by":"crossref","first-page":"71","DOI":"10.1109\/40.848474","volume":"20","author":"L Hammond","year":"2000","unstructured":"Hammond L, Hubbert BA, Siu M, Prabhu MK, Chen M, Olukotun K (2000) The stanford hydra CMP. IEEE Micro 20(2):71\u201384","journal-title":"IEEE Micro"},{"issue":"1\u20132","key":"387_CR5","first-page":"229","volume":"7","author":"WMW Hwu","year":"1993","unstructured":"Hwu WMW, Mahlke SA, Chen WY, Chang PP, Warter NJ, Bringmann RA, Ouellette RG, Hank RE, Kiyohara T, Haab GE, Holm JG, Lavery DM (1993) The superblock: an effective technique for VLIW and superscalar compilation. J Supercomput 7(1\u20132):229\u2013248","journal-title":"J Supercomput"},{"key":"387_CR6","doi-asserted-by":"crossref","unstructured":"Islam MM (2007) On the limitations of compilers to exploit thread-level parallelism in embedded applications. 
Computer and information science, ACIS international conference, pp 60\u201366","DOI":"10.1109\/ICIS.2007.142"},{"key":"387_CR7","unstructured":"Johnson M (1991) Superscalar microprocessor design"},{"key":"387_CR8","doi-asserted-by":"crossref","unstructured":"Jones D, Marlow S, Singh S (2009) Parallel performance tuning for haskell. In: Proceedings of the 2nd ACM SIGPLAN symposium on Haskell, Haskell \u201909. ACM, New York, pp 81\u201392","DOI":"10.1145\/1596638.1596649"},{"key":"387_CR9","unstructured":"Nethercote N (2004) Dynamic binary analysis and instrumentation. Ph.D. thesis, Univ. of Cambridge, Cambridge"},{"issue":"6","key":"387_CR10","doi-asserted-by":"crossref","first-page":"89","DOI":"10.1145\/1273442.1250746","volume":"42","author":"N Nethercote","year":"2007","unstructured":"Nethercote N, Seward J (2007) Valgrind: a framework for heavyweight dynamic binary instrumentation. SIGPLAN Not 42(6):89\u2013100","journal-title":"SIGPLAN Not"},{"key":"387_CR11","doi-asserted-by":"crossref","unstructured":"Nickolls J, Buck I, Garland M, Skadron K (2008) Scalable parallel programming with CUDA. In: SIGGRAPH \u201908: ACM SIGGRAPH 2008 classes, ACM, New York, pp 1\u201314","DOI":"10.1145\/1401132.1401152"},{"key":"387_CR12","doi-asserted-by":"crossref","unstructured":"Ooi CL, Kim SW, Park I, Eigenmann R, Falsafi B, Vijaykumar TN (2001) Multiplex: unifying conventional and speculative thread-level parallelism on a chip multiprocessor. In: Proceedings of the 15th international conference on supercomputing, ICS \u201901. ACM, New York, NY, pp 368\u2013380","DOI":"10.1145\/377792.377863"},{"key":"387_CR13","doi-asserted-by":"crossref","unstructured":"Ottoni G, Rangan R, Stoler A, August DI (2005) Automatic thread extraction with decoupled software pipelining. 
In: Proceedings of the 38th annual IEEE\/ACM international symposium on microarchitecture, MICRO 38. IEEE Computer Society, Washington DC, pp 105\u2013118","DOI":"10.1109\/MICRO.2005.13"},{"key":"387_CR14","unstructured":"Packirisamy V, Zhai A, Chung Hsu W, Chung Yew P, Fook Ngai T (2009) Exploring speculative parallelism in SPEC2006. In: international symposium on performance analysis of systems and software, pp 77\u201388"},{"key":"387_CR15","doi-asserted-by":"crossref","unstructured":"Schaumont PR (2010) Analysis of control flow and data flow. In: A practical introduction to hardware\/software codesign, chap. 3. Springer, Boston, pp 71\u201391","DOI":"10.1007\/978-1-4419-6000-9_3"},{"key":"387_CR16","doi-asserted-by":"crossref","unstructured":"Shobaki G, Wilken KD (2004) Optimal superblock scheduling using enumeration. In: international symposium on microarchitecture, pp 283\u2013293","DOI":"10.1109\/MICRO.2004.27"},{"key":"387_CR17","doi-asserted-by":"crossref","first-page":"414","DOI":"10.1145\/225830.224451","volume":"23","author":"GS Sohi","year":"1995","unstructured":"Sohi GS, Breach SE, Vijaykumar TN (1995) Multiscalar processors. SIGARCH Comput Archit News 23:414\u2013425","journal-title":"SIGARCH Comput Archit News"},{"key":"387_CR18","doi-asserted-by":"crossref","unstructured":"Terboven C, An Sarholz S (2008) OpenMP on multicore architectures. In: A practical programming model for the multi-core era, pp 54\u201364","DOI":"10.1007\/978-3-540-69303-1_5"},{"key":"387_CR19","doi-asserted-by":"crossref","unstructured":"Thornton JE (1965) Parallel operation in the control data 6600. 
In: Proceedings of the October 27\u201329, 1964, fall joint computer conference, Part II: very high speed computer systems, AFIPS \u201964 (Fall, part II). ACM, New York, pp 33\u201340","DOI":"10.1145\/1464039.1464045"},{"issue":"1","key":"387_CR20","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1147\/rd.111.0025","volume":"11","author":"RM Tomasulo","year":"1967","unstructured":"Tomasulo RM (1967) An efficient algorithm for exploiting multiple arithmetic units. IBM J Res Dev 11(1):25\u201333","journal-title":"IBM J Res Dev"},{"issue":"2","key":"387_CR21","doi-asserted-by":"crossref","first-page":"50","DOI":"10.1109\/MM.2011.24","volume":"31","author":"CM Wittenbrink","year":"2011","unstructured":"Wittenbrink CM, Kilgariff E, Prabhu A (2011) Fermi GF100 GPU architecture. Micro IEEE 31(2):50\u201359","journal-title":"Micro IEEE"},{"key":"387_CR22","unstructured":"Ye J (2012) Potential parallelism analysis tool of sequential programs [source code]. https:\/\/github.com\/zjutoe\/fat"},{"key":"387_CR23","doi-asserted-by":"crossref","unstructured":"Ye J, Chen T (2012) Exploring potential parallelism of sequential programs with superblock reordering. In: IEEE HPCC-2012","DOI":"10.1109\/HPCC.2012.12"},{"key":"387_CR24","unstructured":"Zhong H, Mehrara M, Lieberman SA, Mahlke SA (2008) Uncovering hidden loop level parallelism in sequential applications. In: International symposium on high-performance computer, architecture, pp 290\u2013301"},{"issue":"1","key":"387_CR25","doi-asserted-by":"crossref","first-page":"47","DOI":"10.1109\/TPDS.2009.47","volume":"21","author":"DA Zier","year":"2010","unstructured":"Zier DA, Lee B (2010) Performance evaluation of dynamic speculative multithreading with the cascadia architecture. 
IEEE Tran Parallel Distrib Syst 21(1):47\u201359","journal-title":"IEEE Tran Parallel Distrib Syst"}],"container-title":["Computing"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s00607-014-0387-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s00607-014-0387-8\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s00607-014-0387-8","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2019,8,7]],"date-time":"2019-08-07T14:24:00Z","timestamp":1565187840000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s00607-014-0387-8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,2,12]]},"references-count":25,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2014,6]]}},"alternative-id":["387"],"URL":"https:\/\/doi.org\/10.1007\/s00607-014-0387-8","relation":{},"ISSN":["0010-485X","1436-5057"],"issn-type":[{"value":"0010-485X","type":"print"},{"value":"1436-5057","type":"electronic"}],"subject":[],"published":{"date-parts":[[2014,2,12]]}}}