{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,9,9]],"date-time":"2023-09-09T14:33:15Z","timestamp":1694269995696},"reference-count":48,"publisher":"Association for Computing Machinery (ACM)","issue":"3","funder":[{"name":"Government of India vide","award":["ECR\/2016\/000212"]},{"name":"Department of Science and Technology"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Des. Autom. Electron. Syst."],"published-print":{"date-parts":[[2021,5,31]]},"abstract":"Prefetching helps reduce memory access latency in multi-banked NUCA architectures, where the Last Level Cache (LLC) is shared. In such systems, an application running on a core generates significant traffic on the shared resources, namely the underlying network and the LLC. While prefetching helps to increase application performance, an inaccurate prefetcher can cause harm by generating unwanted traffic that further increases network and LLC contention. Increased network contention results in untimely prefetching of cache blocks, thereby reducing the effectiveness of a prefetcher. Prefetch accuracy is extensively used to reduce unwanted prefetches and thereby mitigate prefetcher-caused contention. However, the conventional prefetch accuracy parameter has major limitations in NUCA architectures. The article shows that prefetch accuracy can create two major false-positive cases of prefetching, the Under-estimation and Over-estimation problems, and a false feedback loop that can mislead a prefetcher into generating more unwanted traffic. We propose a novel technique, Coordinated Prefetching for Efficient (COPE), which addresses these issues by redefining prefetch accuracy for such architectures and identifying additional parameters that can avoid generating unwanted prefetch requests. 
Experiments conducted using the PARSEC benchmark on a 64-core system show that COPE achieves a 3% reduction in L1 cache miss rate, a 12.64% improvement in IPC, a 23.2% reduction in average packet latency, and an 18.56% reduction in dynamic power consumption of the underlying network.","DOI":"10.1145\/3428149","type":"journal-article","created":{"date-parts":[[2020,12,31]],"date-time":"2020-12-31T18:41:56Z","timestamp":1609440116000},"page":"1-31","update-policy":"http:\/\/dx.doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["COPE"],"prefix":"10.1145","volume":"26","author":[{"given":"Dipika","family":"Deb","sequence":"first","affiliation":[{"name":"Indian Institute of Technology Guwahati, Assam, India"}]},{"given":"John","family":"Jose","sequence":"additional","affiliation":[{"name":"Indian Institute of Technology Guwahati, Assam, India"}]},{"given":"Maurizio","family":"Palesi","sequence":"additional","affiliation":[{"name":"University of Catania, Catania, Italy"}]}],"member":"320","published-online":{"date-parts":[[2020,12,31]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software. 33--42","author":"Agarwal N."},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of the 28th Annual International Symposium on Computer Architecture. 144--154","author":"Lai An-Chow"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/2.976921"},{"key":"e_1_2_1_4_1","volume-title":"Proceedings of the 46th International Symposium on Computer Architecture. 1--13","author":"Bhatia Eshan"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/2024716.2024718"},{"key":"e_1_2_1_7_1","volume-title":"Proceedings of the IEEE International Symposium on Performance Analysis of Systems Software. 203--212","author":"Cain H. 
W."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/71.395402"},{"key":"e_1_2_1_9_1","volume-title":"Proceedings of the 38th Design Automation Conference. 684--689","author":"Dally W. J."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpdc.2018.09.009"},{"key":"e_1_2_1_11_1","volume-title":"Proceedings of the 38th Annual International Symposium on Computer Architecture. 141--152","author":"Ebrahimi E."},{"key":"e_1_2_1_12_1","volume-title":"Proceedings of the 42nd Annual IEEE\/ACM International Symposium on Microarchitecture. 316--326","author":"Ebrahimi Eiman"},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the IEEE 15th International Symposium on High Performance Computer Architecture. 7--17","author":"Ebrahimi E."},{"key":"e_1_2_1_14_1","volume-title":"Wenisch","author":"Falsafi Babak","year":"2014"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/3243176.3243181"},{"key":"e_1_2_1_16_1","volume-title":"Proceedings of the 19th Annual International Conference on Supercomputing. 31--40","author":"Huh Jaehyuk"},{"key":"e_1_2_1_17_1","volume-title":"Proceedings of the 18th Annual International Conference on Supercomputing. 1--11","author":"Iacobovici Sorin"},{"key":"e_1_2_1_18_1","volume-title":"Proceedings of the 1st JILP Data Prefetching Championship (DPC-1\u201909)","author":"Ishii Yasuo","year":"2009"},{"key":"e_1_2_1_19_1","volume-title":"Proceedings of the International Symposium on High Performance Computer Architecture. 39--50","author":"Jimenez V."},{"key":"e_1_2_1_20_1","volume-title":"Proceedings of the 21st International Conference on Parallel Architectures and Compilation Techniques. 137--146","author":"Jim\u00e9nez Victor"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/325164.325162"},{"key":"e_1_2_1_22_1","volume-title":"Proceedings of the Design, Automation Test in Europe Conference Exhibition. 423--428","author":"Kahng A. 
B."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/635506.605420"},{"key":"e_1_2_1_24_1","volume-title":"Proceedings of the 49th Annual IEEE\/ACM International Symposium on Microarchitecture. 1--12","author":"Kim J."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/3093336.3037701"},{"key":"e_1_2_1_26_1","article-title":"When prefetching works, when it doesn\u2019t, and why","volume":"9","author":"Lee Jaekyu","year":"2012","journal-title":"ACM Trans. Arch. Code Optim."},{"key":"e_1_2_1_27_1","volume-title":"Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis. 56:1--56:10","author":"Hung Tzu-Han","year":"2009"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/2597652.2597660"},{"key":"e_1_2_1_29_1","doi-asserted-by":"crossref","unstructured":"P. Michaud. 2016. Best-offset hardware prefetching. In High Performance Computer Architecture. 469--480.","DOI":"10.1109\/HPCA.2016.7446087"},{"key":"e_1_2_1_30_1","volume-title":"A survey of recent prefetching techniques for processor caches. Comput. Surv. 49, 2","author":"Mittal Sparsh","year":"2016"},{"key":"e_1_2_1_31_1","volume-title":"Proceedings of the 44th Annual IEEE\/ACM International Symposium on Microarchitecture. 374--385","author":"Muralidhara S. P."},{"key":"e_1_2_1_32_1","volume-title":"Proceedings of the International Conference on Parallel Architectures and Compilation Techniques. 441--442","author":"Nachiappan N. C."},{"key":"e_1_2_1_33_1","volume-title":"Proceedings of the 10th International Symposium on High Performance Computer Architecture. 96--96","author":"Nesbit K. J."},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/514191.514219"},{"key":"e_1_2_1_35_1","volume-title":"Proceedings of the IEEE 20th International Symposium on High Performance Computer Architecture. 
626--637","author":"Pugsley S. H."},{"key":"e_1_2_1_36_1","volume-title":"Proceedings of the 21st International Conference on Parallel Architectures and Compilation Techniques. 355--366","author":"Seshadri Vivek"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/2677956"},{"key":"e_1_2_1_38_1","volume-title":"Proceedings of the 33rd International Symposium on Computer Architecture. 252--263","author":"Somogyi S."},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/LES.2019.2897766"},{"key":"e_1_2_1_40_1","volume-title":"Proceedings of the International Symposium on High Performance Computer Architecture. 63--74","author":"Srinath S."},{"key":"e_1_2_1_41_1","volume-title":"Proceedings of the 11th International Workshop on Network on Chip Architectures. 1--6.","author":"Stroobant P."},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1147\/rd.461.0005"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/JSSC.2007.910957"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.5555\/3408352.3408405"},{"key":"e_1_2_1_45_1","volume-title":"Proceedings of the 44th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO\u201911)","author":"Wu C."},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/216585.216588"},{"key":"e_1_2_1_47_1","volume-title":"Meeting midway: Improving CMP performance with memory-side prefetching. In Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques. 289--298","author":"Yedlapalli P."},{"key":"e_1_2_1_48_1","volume-title":"Proceedings of the IEEE 32nd International Conference on Computer Design. 
278--285","author":"Yu J."},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/335231.335247"}],"container-title":["ACM Transactions on Design Automation of Electronic Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3428149","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,1]],"date-time":"2023-01-01T21:58:13Z","timestamp":1672610293000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3428149"}},"subtitle":["Reducing Cache Pollution and Network Contention by Inter-tile Coordinated Prefetching in NoC-based MPSoCs"],"short-title":[],"issued":{"date-parts":[[2020,12,31]]},"references-count":48,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2021,5,31]]}},"alternative-id":["10.1145\/3428149"],"URL":"https:\/\/doi.org\/10.1145\/3428149","relation":{},"ISSN":["1084-4309","1557-7309"],"issn-type":[{"value":"1084-4309","type":"print"},{"value":"1557-7309","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,12,31]]},"assertion":[{"value":"2020-03-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-10-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-12-31","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}