{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,8,3]],"date-time":"2024-08-03T05:55:35Z","timestamp":1722664535001},"reference-count":62,"publisher":"Association for Computing Machinery (ACM)","issue":"4","funder":[{"name":"National Key Research and Development Program of China","award":["2022YFB2404202"]},{"DOI":"10.13039\/501100001809","name":"NSFC","doi-asserted-by":"crossref","award":["62072193"],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Major Scientific Research Project of Zhejiang Lab","award":["2022PI0AC03"]},{"name":"CCF-AFSG Research Fund","award":["RF20220211"]},{"name":"Young Top-notch Talent Cultivation Program of Hubei Province, Key Research and Development Program of Hubei Province","award":["2023BAB078"]},{"name":"Knowledge Innovation Program of Wuhan-Basi Research","award":["2022013301015177"]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2023,12,31]]},"abstract":"Dynamic Graph Neural Network (DGNN) has recently attracted a significant amount of research attention from various domains, because most real-world graphs are inherently dynamic. Despite many research efforts, for DGNN, existing hardware\/software solutions still suffer significantly from redundant computation and memory access overhead, because they need to irregularly access and recompute all graph data of each graph snapshot. To address these issues, we propose an efficient redundancy-aware accelerator, RACE, which enables energy-efficient execution of DGNN models. 
Specifically, we propose a redundancy-aware incremental execution approach into the accelerator design for DGNN to instantly achieve the output features of the latest graph snapshot by correctly and incrementally refining the output features of the previous graph snapshot and also enable regular accesses of vertices\u2019 input features. Through traversing the graph on the fly, RACE identifies the vertices that are not affected by graph updates between successive snapshots to reuse these vertices\u2019 states (i.e., their output features) of the previous snapshot for the processing of the latest snapshot. The vertices affected by graph updates are also tracked to incrementally recompute their new states using their neighbors\u2019 input features of the latest snapshot for correctness. In this way, the processing and accessing of many graph data that are not affected by graph updates can be correctly eliminated, enabling smaller redundant computation and memory access overhead. Besides, the input features, which are accessed more frequently, are dynamically identified according to graph topology and are preferentially resident in the on-chip memory for less off-chip communications. Experimental results show that RACE achieves on average 1139\u00d7 and 84.7\u00d7 speedups for DGNN inference, with average 2242\u00d7 and 234.2\u00d7 energy savings, in comparison with the state-of-the-art software DGNN running on Intel Xeon CPU and NVIDIA A100 GPU, respectively. 
Moreover, for DGNN inference, RACE obtains on average 13.1\u00d7, 11.7\u00d7, 10.4\u00d7, and 7.9\u00d7 speedup and 14.8\u00d7, 12.9\u00d7, 11.5\u00d7, and 8.9\u00d7 energy savings over the state-of-the-art Graph Neural Network accelerators, i.e., AWB-GCN, GCNAX, ReGNN, and I-GCN, respectively.","DOI":"10.1145\/3617685","type":"journal-article","created":{"date-parts":[[2023,8,30]],"date-time":"2023-08-30T09:52:01Z","timestamp":1693389121000},"page":"1-26","update-policy":"http:\/\/dx.doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["RACE: An Efficient Redundancy-aware Accelerator for Dynamic Graph Neural Network"],"prefix":"10.1145","volume":"20","author":[{"ORCID":"http:\/\/orcid.org\/0000-0002-6559-6111","authenticated-orcid":false,"given":"Hui","family":"Yu","sequence":"first","affiliation":[{"name":"National Engineering Research Center for Big Data Technology and System, Service Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, China"}]},{"ORCID":"http:\/\/orcid.org\/0000-0003-0718-8045","authenticated-orcid":false,"given":"Yu","family":"Zhang","sequence":"additional","affiliation":[{"name":"National Engineering Research Center for Big Data Technology and System, Service Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, China"}]},{"ORCID":"http:\/\/orcid.org\/0000-0003-4217-7886","authenticated-orcid":false,"given":"Jin","family":"Zhao","sequence":"additional","affiliation":[{"name":"National Engineering Research Center for Big Data Technology and System, Service Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, 
China"}]},{"ORCID":"http:\/\/orcid.org\/0009-0008-9610-9444","authenticated-orcid":false,"given":"Yujian","family":"Liao","sequence":"additional","affiliation":[{"name":"National Engineering Research Center for Big Data Technology and System, Service Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, China"}]},{"ORCID":"http:\/\/orcid.org\/0009-0006-5088-8249","authenticated-orcid":false,"given":"Zhiying","family":"Huang","sequence":"additional","affiliation":[{"name":"National Engineering Research Center for Big Data Technology and System, Service Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, China"}]},{"ORCID":"http:\/\/orcid.org\/0009-0000-4374-0293","authenticated-orcid":false,"given":"Donghao","family":"He","sequence":"additional","affiliation":[{"name":"National Engineering Research Center for Big Data Technology and System, Service Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, China"}]},{"ORCID":"http:\/\/orcid.org\/0000-0002-6525-9334","authenticated-orcid":false,"given":"Lin","family":"Gu","sequence":"additional","affiliation":[{"name":"National Engineering Research Center for Big Data Technology and System, Service Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, China"}]},{"ORCID":"http:\/\/orcid.org\/0000-0002-3934-7605","authenticated-orcid":false,"given":"Hai","family":"Jin","sequence":"additional","affiliation":[{"name":"National Engineering Research Center for Big Data Technology and System, Service Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, 
Huazhong University of Science and Technology, China"}]},{"ORCID":"http:\/\/orcid.org\/0000-0001-6302-813X","authenticated-orcid":false,"given":"Xiaofei","family":"Liao","sequence":"additional","affiliation":[{"name":"National Engineering Research Center for Big Data Technology and System, Service Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, China"}]},{"ORCID":"http:\/\/orcid.org\/0000-0003-4290-1408","authenticated-orcid":false,"given":"Haikun","family":"Liu","sequence":"additional","affiliation":[{"name":"National Engineering Research Center for Big Data Technology and System, Service Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, China"}]},{"ORCID":"http:\/\/orcid.org\/0000-0001-8618-4581","authenticated-orcid":false,"given":"Bingsheng","family":"He","sequence":"additional","affiliation":[{"name":"National University of Singapore, Singapore"}]},{"ORCID":"http:\/\/orcid.org\/0000-0002-1876-6931","authenticated-orcid":false,"given":"Jianhui","family":"Yue","sequence":"additional","affiliation":[{"name":"Michigan Technological University, America"}]}],"member":"320","published-online":{"date-parts":[[2023,12,14]]},"reference":[{"key":"e_1_3_2_2_2","unstructured":"2022. Academic. Retrieved from https:\/\/west.uni-koblenz.de\/konect\/networks"},{"key":"e_1_3_2_3_2","unstructured":"2022. DBLP and Mobile. Retrieved from https:\/\/dblp.uni-trier.de\/xml\/"},{"key":"e_1_3_2_4_2","unstructured":"2022. Flicker. Retrieved from https:\/\/socialnetworks.mpi-sws.org\/data-imc2007.html"},{"key":"e_1_3_2_5_2","unstructured":"2022. Wikidata. 
Retrieved from https:\/\/github.com\/mniepert\/mmkb\/tree\/master\/TemporalKGs\/wikidata"},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.5555\/3437539.3437703"},{"key":"e_1_3_2_7_2","first-page":"668","volume-title":"Proceedings of the 27th IEEE International Symposium on High-Performance Computer Architecture","author":"Balaji Vignesh","year":"2021","unstructured":"Vignesh Balaji, Neal Crago, Aamer Jaleel, and Brandon Lucia. 2021. P-OPT: Practical optimal cache replacement for graph analytics. In Proceedings of the 27th IEEE International Symposium on High-Performance Computer Architecture. 668\u2013681."},{"key":"e_1_3_2_8_2","doi-asserted-by":"crossref","first-page":"1036","DOI":"10.1145\/3466752.3480096","volume-title":"Proceedings of the 54th Annual IEEE\/ACM International Symposium on Microarchitecture","author":"Basak Abanti","year":"2021","unstructured":"Abanti Basak, Zheng Qu, Jilan Lin, Alaa R. Alameldeen, Zeshan Chishti, Yufei Ding, and Yuan Xie. 2021. Improving streaming graph processing performance using input knowledge. In Proceedings of the 54th Annual IEEE\/ACM International Symposium on Microarchitecture. 1036\u20131050."},{"key":"e_1_3_2_9_2","first-page":"77:1\u201377:15","volume-title":"Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis","author":"Chakaravarthy Venkatesan T.","year":"2021","unstructured":"Venkatesan T. Chakaravarthy, Shivmaran S. Pandian, Saurabh Raje, Yogish Sabharwal, Toyotaro Suzumura, and Shashanka Ubaru. 2021. Efficient scaling of dynamic graph neural networks. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. 77:1\u201377:15."},{"key":"e_1_3_2_10_2","first-page":"1","volume-title":"Proceedings of the 28th IEEE International Symposium on High-Performance Computer Architecture","author":"Chen Cen","year":"2022","unstructured":"Cen Chen, Kenli Li, Yangfan Li, and Xiaofeng Zou. 2022. 
ReGNN: A redundancy-eliminated graph neural networks accelerator. In Proceedings of the 28th IEEE International Symposium on High-Performance Computer Architecture. 1\u201314."},{"issue":"1","key":"e_1_3_2_11_2","first-page":"1","article-title":"GC-LSTM: Graph convolution embedded LSTM for dynamic network link prediction","volume":"12","author":"Chen Jinyin","year":"2021","unstructured":"Jinyin Chen, Xueke Wang, and Xuanheng Xu. 2021. GC-LSTM: Graph convolution embedded LSTM for dynamic network link prediction. Appl. Intell. 12, 1 (2021), 1\u201316.","journal-title":"Appl. Intell."},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1145\/2168836.2168846"},{"key":"e_1_3_2_13_2","first-page":"234","volume-title":"Proceedings of the 26th IEEE International Symposium on High Performance Computer Architecture","author":"Faldu Priyank","year":"2020","unstructured":"Priyank Faldu, Jeff Diamond, and Boris Grot. 2020. Domain-specialized cache management for graph analytics. In Proceedings of the 26th IEEE International Symposium on High Performance Computer Architecture. 234\u2013248."},{"key":"e_1_3_2_14_2","first-page":"155","volume-title":"Proceedings of the ACM International Conference on Management of Data","author":"Fan Wenfei","year":"2017","unstructured":"Wenfei Fan, Chunming Hu, and Chao Tian. 2017. Incremental graph computations: Doable and undoable. In Proceedings of the ACM International Conference on Management of Data. 155\u2013169."},{"key":"e_1_3_2_15_2","first-page":"922","volume-title":"Proceedings of the 53rd Annual IEEE\/ACM International Symposium on Microarchitecture","author":"Geng Tong","year":"2020","unstructured":"Tong Geng, Ang Li, Runbin Shi, Chunshu Wu, Tianqi Wang, Yanfei Li, Pouya Haghi, Antonino Tumeo, Shuai Che, Steven K. Reinhardt, and Martin C. Herbordt. 2020. AWB-GCN: A graph convolutional network accelerator with runtime workload rebalancing. 
In Proceedings of the 53rd Annual IEEE\/ACM International Symposium on Microarchitecture. 922\u2013936."},{"key":"e_1_3_2_16_2","doi-asserted-by":"crossref","first-page":"1051","DOI":"10.1145\/3466752.3480113","volume-title":"Proceedings of the 54th Annual IEEE\/ACM International Symposium on Microarchitecture","author":"Geng Tong","year":"2021","unstructured":"Tong Geng, Chunshu Wu, Yongan Zhang, Cheng Tan, Chenhao Xie, Haoran You, Martin C. Herbordt, Yingyan Lin, and Ang Li. 2021. I-GCN: A graph convolutional network accelerator with runtime locality enhancement through islandization. In Proceedings of the 54th Annual IEEE\/ACM International Symposium on Microarchitecture. 1051\u20131063."},{"key":"e_1_3_2_17_2","first-page":"17","volume-title":"Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation","author":"Gonzalez Joseph E.","year":"2012","unstructured":"Joseph E. Gonzalez, Yucheng Low, Haijie Gu, Danny Bickson, and Carlos Guestrin. 2012. PowerGraph: Distributed graph-parallel computation on natural graphs. In Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation. 17\u201330."},{"key":"e_1_3_2_18_2","first-page":"56:1\u201356:13","volume-title":"Proceedings of the 49th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO\u201916)","author":"Ham Tae Jun","year":"2016","unstructured":"Tae Jun Ham, Lisa Wu, Narayanan Sundaram, Nadathur Satish, and Margaret Martonosi. 2016. Graphicionado: A high-performance and energy-efficient accelerator for graph analytics. In Proceedings of the 49th Annual IEEE\/ACM International Symposium on Microarchitecture (MICRO\u201916). 56:1\u201356:13."},{"key":"e_1_3_2_19_2","first-page":"60","volume-title":"Proceedings of the 37th International Symposium on Computer Architecture","author":"Jaleel Aamer","year":"2010","unstructured":"Aamer Jaleel, Kevin B. Theobald, Simon C. Steely Jr., and Joel S. Emer. 2010. 
High performance cache replacement using re-reference interval prediction. In Proceedings of the 37th International Symposium on Computer Architecture. 60\u201371."},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1145\/2540708.2540733"},{"issue":"5","key":"e_1_3_2_21_2","first-page":"70:1\u201370:73","article-title":"Representation learning for dynamic graphs: A survey","volume":"21","author":"Kazemi Seyed Mehran","year":"2020","unstructured":"Seyed Mehran Kazemi, Rishab Goel, Kshitij Jain, Ivan Kobyzev, Akshay Sethi, Peter Forsyth, and Pascal Poupart. 2020. Representation learning for dynamic graphs: A survey. J. Mach. Learn. Res. 21, 5 (2020), 70:1\u201370:73.","journal-title":"J. Mach. Learn. Res."},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.1109\/LCA.2015.2414456"},{"key":"e_1_3_2_23_2","first-page":"1","volume-title":"Proceedings of the 5th International Conference on Learning Representations","author":"Kipf Thomas N.","year":"2017","unstructured":"Thomas N. Kipf and Max Welling. 2017. Semi-supervised classification with graph convolutional networks. In Proceedings of the 5th International Conference on Learning Representations. 1\u201314."},{"key":"e_1_3_2_24_2","doi-asserted-by":"crossref","unstructured":"Ekkehard K\u00f6hler Rolf H. M\u00f6hring and Martin Skutella. 2009. Traffic networks and flows over time. In Algorithmics of Large and Complex Networks . Springer 166\u2013196.","DOI":"10.1007\/978-3-642-02094-0_9"},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.1109\/INFOCOM.2019.8737631"},{"key":"e_1_3_2_26_2","first-page":"775","volume-title":"Proceedings of the IEEE International Symposium on High-Performance Computer Architecture","author":"Li Jiajun","year":"2021","unstructured":"Jiajun Li, Ahmed Louri, Avinash Karanth, and Razvan C. Bunescu. 2021. GCNAX: A flexible and energy-efficient accelerator for graph convolutional neural networks. 
In Proceedings of the IEEE International Symposium on High-Performance Computer Architecture. 775\u2013788."},{"key":"e_1_3_2_27_2","first-page":"72:1\u201372:9","volume-title":"Proceedings of the IEEE\/ACM International Conference On Computer Aided Design","author":"Liang Shengwen","year":"2020","unstructured":"Shengwen Liang, Cheng Liu, Ying Wang, Huawei Li, and Xiaowei Li. 2020. DeepBurning-GL: An automated framework for generating graph neural network accelerators. In Proceedings of the IEEE\/ACM International Conference On Computer Aided Design. 72:1\u201372:9."},{"issue":"4","key":"e_1_3_2_28_2","first-page":"42202:1\u201342202:1","article-title":"Graph partitions and the controllability of directed signed networks","volume":"62","author":"Liu Xianzhu","year":"2019","unstructured":"Xianzhu Liu, Zhijian Ji, and Ting Hou. 2019. Graph partitions and the controllability of directed signed networks. Sci. Chin. Inf. Sci. 62, 4 (2019), 42202:1\u201342202:11.","journal-title":"Sci. Chin. Inf. Sci."},{"key":"e_1_3_2_29_2","first-page":"729","volume-title":"Proceedings of the SIAM International Conference on Data Mining","author":"Malik Osman Asif","year":"2021","unstructured":"Osman Asif Malik, Shashanka Ubaru, Lior Horesh, Misha E. Kilmer, and Haim Avron. 2021. Dynamic graph convolutional networks using the tensor M-product. In Proceedings of the SIAM International Conference on Data Mining. 729\u2013737."},{"issue":"1","key":"e_1_3_2_30_2","first-page":"1","article-title":"Dynamic graph convolutional networks","volume":"97","author":"Manessi Franco","year":"2020","unstructured":"Franco Manessi, Alessandro Rozza, and Mario Manzo. 2020. Dynamic graph convolutional networks. Pattern Recogn. 
97, 1 (2020), 1\u201318.","journal-title":"Pattern Recogn."},{"key":"e_1_3_2_31_2","first-page":"83","volume-title":"Proceedings of the 16th European Conference on Computer Systems","author":"Mariappan Mugilan","year":"2021","unstructured":"Mugilan Mariappan, Joanna Che, and Keval Vora. 2021. DZiG: Sparsity-aware incremental processing of streaming graphs. In Proceedings of the 16th European Conference on Computer Systems. 83\u201398."},{"key":"e_1_3_2_32_2","first-page":"25:1\u201325:16","volume-title":"Proceedings of the 14th European Conference on Computer Systems","author":"Mariappan Mugilan","year":"2019","unstructured":"Mugilan Mariappan and Keval Vora. 2019. GraphBolt: Dependency-driven synchronous processing of streaming graphs. In Proceedings of the 14th European Conference on Computer Systems. 25:1\u201325:16."},{"key":"e_1_3_2_33_2","first-page":"15","volume-title":"Proceedings of the 56th Annual Design Automation Conference","author":"McCrabb Andrew","year":"2019","unstructured":"Andrew McCrabb, Eric Winsor, and Valeria Bertacco. 2019. DREDGE: Dynamic repartitioning during dynamic graph execution. In Proceedings of the 56th Annual Design Automation Conference. 15\u201328."},{"issue":"5","key":"e_1_3_2_34_2","first-page":"1","article-title":"Learning hyperspectral images from RGB images via a coarse-to-fine CNN","volume":"65","author":"Mei Shaohui","year":"2022","unstructured":"Shaohui Mei, Yunhao Geng, Junhui Hou, and Qian Du. 2022. Learning hyperspectral images from RGB images via a coarse-to-fine CNN. Sci. Chin. Inf. Sci. 65, 5 (2022), 1\u201314.","journal-title":"Sci. Chin. Inf. 
Sci."},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2010-343"},{"key":"e_1_3_2_36_2","first-page":"1","volume-title":"Proceedings of the 51st Annual IEEE\/ACM International Symposium on Microarchitecture","author":"Mukkara Anurag","year":"2018","unstructured":"Anurag Mukkara, Nathan Beckmann, Maleen Abeydeera, Xiaosong Ma, and Daniel S\u00e1nchez. 2018. Exploiting locality in graph analytics through hardware-accelerated traversal scheduling. In Proceedings of the 51st Annual IEEE\/ACM International Symposium on Microarchitecture. 1\u201314."},{"issue":"1","key":"e_1_3_2_37_2","first-page":"1","article-title":"CACTI 6.0: A tool to model large caches","volume":"27","author":"Muralimanohar Naveen","year":"2016","unstructured":"Naveen Muralimanohar, Rajeev Balasubramonian, and Norman P Jouppi. 2016. CACTI 6.0: A tool to model large caches. HP Lab. 27, 1 (2016), 1\u201328.","journal-title":"HP Lab."},{"key":"e_1_3_2_38_2","first-page":"118","volume-title":"Proceedings of the IEEE International Conference on Information Communication and Signal Processing","author":"Nguyen Huy Trung","year":"2018","unstructured":"Huy Trung Nguyen, Quoc Dung Ngo, and Van Hoang Le. 2018. IoT botnet detection approach based on PSI graph and DGCNN classifier. In Proceedings of the IEEE International Conference on Information Communication and Signal Processing. 118\u2013122."},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i04.5984"},{"key":"e_1_3_2_40_2","doi-asserted-by":"crossref","first-page":"1091","DOI":"10.1145\/3466752.3480126","volume-title":"Proceedings of the 54th Annual IEEE\/ACM International Symposium on Microarchitecture","author":"Rahman Shafiur","year":"2021","unstructured":"Shafiur Rahman, Mahbod Afarin, Nael B. Abu-Ghazaleh, and Rajiv Gupta. 2021. JetStream: Graph analytics on streaming data with event-driven hardware accelerator. 
In Proceedings of the 54th Annual IEEE\/ACM International Symposium on Microarchitecture. 1091\u20131105."},{"key":"e_1_3_2_41_2","first-page":"319","volume-title":"Proceedings of the 22nd International Conference on Parallel and Distributed Computing","author":"Sengupta Dipanjan","year":"2016","unstructured":"Dipanjan Sengupta, Narayanan Sundaram, Xia Zhu, Theodore L. Willke, Jeffrey S. Young, Matthew Wolf, and Karsten Schwan. 2016. GraphIn: An online high performance incremental graph processing framework. In Proceedings of the 22nd International Conference on Parallel and Distributed Computing. 319\u2013333."},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1504\/IJGUC.2016.077491"},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.cell.2020.01.021"},{"key":"e_1_3_2_44_2","first-page":"87","volume-title":"Proceedings of the IEEE International Symposium on Workload Characterization","author":"Talati Nishil","year":"2021","unstructured":"Nishil Talati, Di Jin, Haojie Ye, Ajay Brahmakshatriya, Ganesh Dasika, Saman Amarasinghe, Trevor Mudge, Danai Koutra, and Ronald Dreslinski. 2021. A deep dive into understanding the random walk-based temporal graph learning. In Proceedings of the IEEE International Symposium on Workload Characterization. 87\u2013100."},{"key":"e_1_3_2_45_2","first-page":"1270","volume-title":"Proceedings of the 55th IEEE\/ACM International Symposium on Microarchitecture","author":"Talati Nishil","year":"2022","unstructured":"Nishil Talati, Haojie Ye, Sanketh Vedula, Kuan-Yu Chen, Yuhan Chen, Daniel Liu, Yichao Yuan, David Blaauw, Alex Bronstein, Trevor Mudge, et\u00a0al. 2022. Mint: An accelerator for mining temporal motifs. In Proceedings of the 55th IEEE\/ACM International Symposium on Microarchitecture. 
1270\u20131287."},{"key":"e_1_3_2_46_2","first-page":"269","volume-title":"Proceedings of the USENIX Annual Technical Conference","author":"Vaziri Pourya","year":"2021","unstructured":"Pourya Vaziri and Keval Vora. 2021. Controlling memory footprint of stateful streaming graph processing. In Proceedings of the USENIX Annual Technical Conference. 269\u2013283."},{"key":"e_1_3_2_47_2","first-page":"1","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Velickovic Petar","year":"2017","unstructured":"Petar Velickovic, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Lio, Yoshua Bengio, et\u00a0al. 2017. Graph attention networks. In Proceedings of the International Conference on Learning Representations. 1\u201312."},{"key":"e_1_3_2_48_2","first-page":"237","volume-title":"Proceedings of the 22nd International Conference on Architectural Support for Programming Languages and Operating Systems","author":"Vora Keval","year":"2017","unstructured":"Keval Vora, Rajiv Gupta, and Guoqing Xu. 2017. KickStarter: Fast and accurate computations on streaming graphs via trimmed approximations. In Proceedings of the 22nd International Conference on Architectural Support for Programming Languages and Operating Systems. 237\u2013251."},{"key":"e_1_3_2_49_2","article-title":"Deep graph library: Towards efficient and scalable deep learning on graphs","author":"Wang Minjie","year":"2019","unstructured":"Minjie Wang, Lingfan Yu, Da Zheng, Quan Gan, Yu Gai, Zihao Ye, Mufei Li, Jinjing Zhou, Qi Huang, Chao Ma, Ziyue Huang, Qipeng Guo, Hao Zhang, Haibin Lin, Junbo Zhao, Jinyang Li, Alexander J. Smola, and Zheng Zhang. 2019. Deep graph library: Towards efficient and scalable deep learning on graphs. arXiv:1909.01315. 
Retrieved from https:\/\/arxiv.org\/abs\/1909.01315","journal-title":"arXiv:1909.01315"},{"key":"e_1_3_2_50_2","first-page":"149","volume-title":"Proceedings of the ACM\/SIGDA International Symposium on Field Programmable Gate Arrays","author":"Wang Qinggang","year":"2021","unstructured":"Qinggang Wang, Long Zheng, Yu Huang, Pengcheng Yao, Chuangyi Gui, Xiaofei Liao, Hai Jin, Wenbin Jiang, and Fubing Mao. 2021. GraSU: A fast graph update library for FPGA-based dynamic graph processing. In Proceedings of the ACM\/SIGDA International Symposium on Field Programmable Gate Arrays. 149\u2013159."},{"key":"e_1_3_2_51_2","doi-asserted-by":"crossref","first-page":"73","DOI":"10.1007\/978-981-15-8135-9_6","volume-title":"Proceedings of the 13th Advanced Computer Architecture","volume":"1256","author":"Wang Zhao","year":"2020","unstructured":"Zhao Wang, Yijin Guan, Guangyu Sun, Dimin Niu, Yuhao Wang, Hongzhong Zheng, and Yinhe Han. 2020. GNN-PIM: A processing-in-memory architecture for graph neural networks. In Proceedings of the 13th Advanced Computer Architecture, Vol. 1256. 73\u201386."},{"key":"e_1_3_2_52_2","first-page":"15","volume-title":"Proceedings of the IEEE International Symposium on High Performance Computer Architecture","author":"Yan Mingyu","year":"2020","unstructured":"Mingyu Yan, Lei Deng, Xing Hu, Ling Liang, Yujing Feng, Xiaochun Ye, Zhimin Zhang, Dongrui Fan, and Yuan Xie. 2020. HyGCN: A GCN accelerator with hybrid architecture. In Proceedings of the IEEE International Symposium on High Performance Computer Architecture. 15\u201329."},{"key":"e_1_3_2_53_2","doi-asserted-by":"publisher","DOI":"10.1109\/LCA.2021.3090954"},{"key":"e_1_3_2_54_2","first-page":"15","volume-title":"Proceedings of the 28th IEEE International Symposium on High-Performance Computer Architecture","author":"You Haoran","year":"2022","unstructured":"Haoran You, Tong Geng, Yongan Zhang, Ang Li, and Yingyan Lin. 2022. 
GCoD: Graph convolutional network acceleration via dedicated algorithm and accelerator co-design. In Proceedings of the 28th IEEE International Symposium on High-Performance Computer Architecture. 15\u201327."},{"key":"e_1_3_2_55_2","first-page":"255","volume-title":"Proceedings of the ACM\/SIGDA International Symposium on Field-Programmable Gate Arrays","author":"Zeng Hanqing","year":"2020","unstructured":"Hanqing Zeng and Viktor K. Prasanna. 2020. GraphACT: Accelerating GCN training on CPU-FPGA heterogeneous platforms. In Proceedings of the ACM\/SIGDA International Symposium on Field-Programmable Gate Arrays. 255\u2013265."},{"key":"e_1_3_2_56_2","first-page":"29","volume-title":"Proceedings of the 29th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines","author":"Zhang Bingyi","year":"2021","unstructured":"Bingyi Zhang, Rajgopal Kannan, and Viktor K. Prasanna. 2021. BoostGCN: A framework for optimizing GCN inference on FPGA. In Proceedings of the 29th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines. 29\u201339."},{"key":"e_1_3_2_57_2","first-page":"567","volume-title":"Proceedings of the 48th ACM\/IEEE Annual International Symposium on Computer Architecture","author":"Zhang Xingyao","year":"2021","unstructured":"Xingyao Zhang, Haojun Xia, Donglin Zhuang, Hao Sun, Xin Fu, Michael B. Taylor, and Shuaiwen Leon Song. 2021. \\(\\eta\\) -LSTM: Co-designing highly-efficient large LSTM training via exploiting memory-saving and architectural design opportunities. In Proceedings of the 48th ACM\/IEEE Annual International Symposium on Computer Architecture. 
567\u2013580."},{"key":"e_1_3_2_58_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2013.235"},{"key":"e_1_3_2_59_2","first-page":"1","volume-title":"Proceedings of the IEEE\/ACM International Conference On Computer Aided Design","author":"Zhang Yongan","year":"2021","unstructured":"Yongan Zhang, Haoran You, Yonggan Fu, Tong Geng, Ang Li, and Yingyan Lin. 2021. G-CoS: GNN-accelerator co-search towards both better accuracy and efficiency. In Proceedings of the IEEE\/ACM International Conference On Computer Aided Design. 1\u20139."},{"key":"e_1_3_2_60_2","doi-asserted-by":"crossref","first-page":"116","DOI":"10.1145\/3470496.3527409","volume-title":"Proceedings of the 49th Annual International Symposium on Computer Architecture","author":"Zhao Jin","year":"2022","unstructured":"Jin Zhao, Yun Yang, Yu Zhang, Xiaofei Liao, Lin Gu, Ligang He, Bingsheng He, Hai Jin, Haikun Liu, Xinyu Jiang, and Hui Yu. 2022. TDGraph: A topology-driven accelerator for high-performance streaming graph processing. In Proceedings of the 49th Annual International Symposium on Computer Architecture. 116\u2013129."},{"key":"e_1_3_2_61_2","doi-asserted-by":"publisher","DOI":"10.14778\/3529337.3529342"},{"key":"e_1_3_2_62_2","first-page":"1009","volume-title":"Proceedings of the 58th ACM\/IEEE Design Automation Conference","author":"Zhou Zhe","year":"2021","unstructured":"Zhe Zhou, Bizhao Shi, Zhe Zhang, Yijin Guan, Guangyu Sun, and Guojie Luo. 2021. BlockGNN: Towards efficient GNN acceleration using block-circulant weight matrices. In Proceedings of the 58th ACM\/IEEE Design Automation Conference. 
1009\u20131014."},{"key":"e_1_3_2_63_2","doi-asserted-by":"publisher","DOI":"10.14778\/3352063.3352127"}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3617685","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,12,14]],"date-time":"2023-12-14T12:16:43Z","timestamp":1702556203000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3617685"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,12,14]]},"references-count":62,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2023,12,31]]}},"alternative-id":["10.1145\/3617685"],"URL":"https:\/\/doi.org\/10.1145\/3617685","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"value":"1544-3566","type":"print"},{"value":"1544-3973","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,12,14]]},"assertion":[{"value":"2022-10-21","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-08-04","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-12-14","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}