{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,7,2]],"date-time":"2024-07-02T13:50:04Z","timestamp":1719928204842},"reference-count":55,"publisher":"IOP Publishing","issue":"1","license":[{"start":{"date-parts":[[2024,3,1]],"date-time":"2024-03-01T00:00:00Z","timestamp":1709251200000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,3,1]],"date-time":"2024-03-01T00:00:00Z","timestamp":1709251200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/iopscience.iop.org\/info\/page\/text-and-data-mining"}],"content-domain":{"domain":["iopscience.iop.org"],"crossmark-restriction":false},"short-container-title":["Neuromorph. Comput. Eng."],"published-print":{"date-parts":[[2024,3,1]]},"abstract":"The demand for computation driven by machine learning and deep learning applications has grown exponentially over the past five years (Sevilla et al 2022 2022 International Joint Conference on Neural Networks (IJCNN) (IEEE) pp 1-8), leading to a significant surge in computing hardware products. Meanwhile, this rapid increase has exacerbated the memory wall bottleneck within mainstream Von Neumann architectures (Hennessy and Patterson et al 2011 Computer architecture: a quantitative approach (Elsevier)). For instance, NVIDIA graphics processing units (GPUs) have gained nearly 200x in fp32 computing power from the P100 to the H100 over the last five years (NVIDIA Tesla P100 2023 (www.nvidia.com\/en-us\/data-center\/tesla-p100\/); NVIDIA H100 Tensor Core GPU 2023 (www.nvidia.com\/en-us\/data-center\/h100\/)), accompanied by a mere 8x scaling in memory bandwidth. 
To mitigate these data movement challenges, process-in-memory designs, especially resistive random-access memory (ReRAM)-based solutions, have emerged as compelling candidates (Verma et al 2019 IEEE Solid-State Circuits Mag. 11 43\u201355; Sze et al 2017 Proc. IEEE 105 2295\u2013329). However, this shift in hardware design poses distinct challenges at the design phase, given the limitations of existing hardware design tools. Popular design tools today can characterize analog behavior via SPICE tools (PrimeSim HSPICE 2023 (www.synopsys.com\/implementation-and-signoff\/ams-simulation\/primesim-hspice.html)), system and logical behavior using Verilog tools (VCS 2023 (www.synopsys.com\/verification\/simulation\/vcs.html)), and mixed-signal behavior through toolboxes such as CPPSIM (Meninger 2023 (www.cppsim.org\/Tutorials\/wideband_fracn_tutorial.pdf)). Nonetheless, the design of in-memory computing systems, especially those involving non-CMOS devices, presents a unique need for characterizing mixed-signal computing behavior across a large number of cells within a memory bank. This requirement falls beyond the scope of conventional design tools. In this paper, we bridge this gap by introducing the ReARTSim framework, a GPU-accelerated mixed-signal transient simulator for analyzing ReRAM crossbar arrays. 
This tool facilitates the characterization of analog circuit and device behavior on a large scale, while also providing enhanced simulation performance for complex algorithm analysis, sign-off, and verification.","DOI":"10.1088\/2634-4386\/ad29fc","type":"journal-article","created":{"date-parts":[[2024,2,16]],"date-time":"2024-02-16T22:20:40Z","timestamp":1708122040000},"page":"014006","update-policy":"http:\/\/dx.doi.org\/10.1088\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["ReARTSim: an ReRAM ARray Transient Simulator with GPU optimized runtime acceleration"],"prefix":"10.1088","volume":"4","author":[{"ORCID":"http:\/\/orcid.org\/0000-0002-4440-0706","authenticated-orcid":true,"given":"Yu","family":"Sui","sequence":"first","affiliation":[]},{"given":"Tianhe","family":"Yu","sequence":"additional","affiliation":[]},{"given":"Shiming","family":"Song","sequence":"additional","affiliation":[]}],"member":"266","published-online":{"date-parts":[[2024,3,1]]},"reference":[{"key":"ncead29fcbib1","doi-asserted-by":"publisher","first-page":"1420","DOI":"10.1109\/TED.2019.2961505","article-title":"ReRAM: history, status, and future","volume":"67","author":"Chen","year":"2020","journal-title":"IEEE Trans. Electron. Devices"},{"key":"ncead29fcbib2","article-title":"Panasonic partners with UMC on 40nm ReRAM","author":"Clarke","year":"2017"},{"key":"ncead29fcbib3","article-title":"Crossbar ReRAM in production at SMIC","author":"Clarke","year":"2017"},{"key":"ncead29fcbib4","doi-asserted-by":"publisher","first-page":"148","DOI":"10.1038\/nnano.2009.456","article-title":"Atomic structure of conducting nanofilaments in TiO2 resistive switching memory","volume":"5","author":"Kwon","year":"2010","journal-title":"Nat. 
Nanotechnol."},{"key":"ncead29fcbib5","doi-asserted-by":"publisher","first-page":"1172","DOI":"10.1109\/TED.2012.2184545","article-title":"On the switching parameter variation of metal-oxide RRAM\u2014part I: physical modeling and simulation methodology","volume":"59","author":"Guan","year":"2012","journal-title":"IEEE Trans. Electron. Devices"},{"key":"ncead29fcbib6","doi-asserted-by":"publisher","first-page":"701","DOI":"10.1021\/acsaelm.9b00792","article-title":"Quantitative, dynamic TaOx memristor\/resistive random access memory model","volume":"2","author":"Lee","year":"2020","journal-title":"ACS Appl. Electron. Mater."},{"key":"ncead29fcbib7","doi-asserted-by":"publisher","first-page":"725","DOI":"10.3390\/mi13050725","article-title":"Conductive bridge random access memory (CBRAM): challenges and opportunities for memory and neuromorphic computing applications","volume":"13","author":"Abbas","year":"2022","journal-title":"Micromachines"},{"key":"ncead29fcbib8","first-page":"1","article-title":"Resistive memories for ultra-low-power embedded computing design","volume":"vol 6","author":"Vianello","year":"2014"},{"key":"ncead29fcbib9","doi-asserted-by":"publisher","first-page":"27","DOI":"10.1145\/3007787.3001140","article-title":"Prime: a novel processing-in-memory architecture for neural network computation in reram-based main memory","volume":"44","author":"Chi","year":"2016","journal-title":"ACM SIGARCH Comput. Archit. 
News"},{"key":"ncead29fcbib10","first-page":"715","article-title":"PUMA: a programmable ultra-efficient memristor-based accelerator for machine learning inference","author":"Ankit","year":"2019"},{"key":"ncead29fcbib11","doi-asserted-by":"publisher","first-page":"504","DOI":"10.1038\/s41586-022-04992-8","article-title":"A compute-in-memory chip based on resistive random-access memory","volume":"608","author":"Wan","year":"2022","journal-title":"Nature"},{"key":"ncead29fcbib12","doi-asserted-by":"publisher","first-page":"6","DOI":"10.1109\/MNANO.2018.2844901","article-title":"Neuromorphic computing using memristor crossbar networks: a focus on bio-inspired approaches","volume":"12","author":"Jeong","year":"2018","journal-title":"IEEE Nanotechnol. Mag."},{"key":"ncead29fcbib13","doi-asserted-by":"publisher","first-page":"721","DOI":"10.1109\/TNANO.2017.2710158","article-title":"Temporal learning using second-order memristors","volume":"16","author":"Zidan","year":"2017","journal-title":"IEEE Trans. 
Nanotechnol."},{"key":"ncead29fcbib14","article-title":"Analog devices","author":"LTspice Simulator"},{"key":"ncead29fcbib15","article-title":"Ngspice","author":"Ngspice Simulator"},{"key":"ncead29fcbib16","article-title":"Synopsys","author":"PrimeSim HSPICE"},{"key":"ncead29fcbib17","author":"Nagel","year":"1975"},{"key":"ncead29fcbib18","first-page":"1","article-title":"Parallel circuit simulation using the direct method on a heterogeneous cloud","author":"Helal","year":"2015"},{"key":"ncead29fcbib19","article-title":"Sandia National Laboratories","author":"Xyce Simulator"},{"key":"ncead29fcbib20","first-page":"1","article-title":"TinySPICE: a parallel SPICE simulator on GPU for massively repeated small circuit simulations","author":"Han"},{"key":"ncead29fcbib21","doi-asserted-by":"publisher","DOI":"10.1016\/j.array.2021.100116","article-title":"Modeling and simulating in-memory memristive deep learning systems: an overview of current efforts","volume":"13","author":"Lammie","year":"2022","journal-title":"Array"},{"key":"ncead29fcbib22","doi-asserted-by":"publisher","first-page":"2306","DOI":"10.1109\/TCAD.2020.3043731","article-title":"DNN+ NeuroSim V2. 0: an end-to-end benchmarking framework for compute-in-memory accelerators for on-chip training","volume":"40","author":"Peng","year":"2020","journal-title":"IEEE Trans. Comput.-Aided Des. Integr. 
Circuits Syst."},{"key":"ncead29fcbib23","first-page":"1","article-title":"A flexible and fast PyTorch toolkit for simulating training and inference on analog crossbar arrays","author":"Rasch"},{"key":"ncead29fcbib24","doi-asserted-by":"publisher","first-page":"124","DOI":"10.1016\/j.neucom.2022.02.043","article-title":"MemTorch: an open-source simulation framework for memristive deep learning systems","volume":"485","author":"Lammie","year":"2022","journal-title":"Neurocomputing"},{"key":"ncead29fcbib25","article-title":"PyTorch","author":"PyTorch"},{"key":"ncead29fcbib26","article-title":"NVIDIA","author":"CUDA Toolkit"},{"key":"ncead29fcbib27","doi-asserted-by":"publisher","first-page":"1991","DOI":"10.1109\/JPROC.2012.2188770","article-title":"Memristive device fundamentals and modeling: applications to circuits and systems simulation","volume":"100","author":"Eshraghian","year":"2012","journal-title":"Proc. IEEE"},{"key":"ncead29fcbib28","doi-asserted-by":"publisher","first-page":"1436","DOI":"10.1109\/LED.2011.2163292","article-title":"A memristor device model","volume":"32","author":"Yakopcic","year":"2011","journal-title":"IEEE Electron. Device Lett."},{"key":"ncead29fcbib29","doi-asserted-by":"publisher","first-page":"429","DOI":"10.1038\/nnano.2008.160","article-title":"Memristive switching mechanism for metal\/oxide\/metal nanodevices","volume":"3","author":"Yang","year":"2008","journal-title":"Nat. Nanotechnol."},{"key":"ncead29fcbib30","doi-asserted-by":"publisher","first-page":"661","DOI":"10.1088\/0143-0807\/30\/4\/001","article-title":"The elusive memristor: properties of basic electrical circuits","volume":"30","author":"Joglekar","year":"2009","journal-title":"Eur. J. 
Phys."},{"key":"ncead29fcbib31","first-page":"210","article-title":"SPICE model of memristor with nonlinear dopant drift","volume":"18","author":"Biolek","year":"2009","journal-title":"Radioengineering"},{"key":"ncead29fcbib32","doi-asserted-by":"publisher","first-page":"786","DOI":"10.1109\/TCSII.2015.2433536","article-title":"VTEAM: a general model for voltage-controlled memristors","volume":"62","author":"Kvatinsky","year":"2015","journal-title":"IEEE Trans. Circuits Syst. II"},{"key":"ncead29fcbib33","author":"Bradie","year":"2006"},{"key":"ncead29fcbib34","first-page":"MY.7.1","article-title":"Variability of resistive switching memories and its impact on crossbar array performance","author":"Chen"},{"key":"ncead29fcbib35","doi-asserted-by":"publisher","first-page":"1183","DOI":"10.1109\/TED.2012.2184544","article-title":"On the switching parameter variation of metal oxide RRAM\u2014part II: model corroboration and device design strategy","volume":"59","author":"Yu","year":"2012","journal-title":"IEEE Trans. Electron. Devices"},{"key":"ncead29fcbib36","first-page":"877","article-title":"Impact of process variations on emerging memristor","author":"Niu"},{"key":"ncead29fcbib37","first-page":"2592","article-title":"Memristive devices for stochastic computing","author":"Gaba"},{"key":"ncead29fcbib38","first-page":"80","article-title":"BSB training scheme implementation on memristor-based circuit","author":"Hu"},{"key":"ncead29fcbib39","first-page":"1","article-title":"Vortex: variation-aware training for memristor X-bar","author":"Liu","year":"2015"},{"key":"ncead29fcbib40","doi-asserted-by":"publisher","first-page":"2213","DOI":"10.1109\/TED.2020.2979606","article-title":"A parallel multibit programing scheme with high precision for RRAM-based neuromorphic systems","volume":"67","author":"Chen","year":"2020","journal-title":"IEEE Trans. Electron. 
Devices"},{"key":"ncead29fcbib41","first-page":"1","article-title":"Improving noise tolerance of mixed-signal neural networks","author":"Klachko"},{"key":"ncead29fcbib42","article-title":"NVIDIA","author":"CUDA C++ programming guide","year":"2023"},{"key":"ncead29fcbib43","article-title":"NVIDIA","author":"NVIDIA tesla P100"},{"key":"ncead29fcbib44","article-title":"NVIDIA","author":"NVIDIA H100 tensor core GPU"},{"key":"ncead29fcbib45","article-title":"Benchmarking TPU, GPU, and CPU platforms for deep learning","author":"Wang","year":"2019"},{"key":"ncead29fcbib46","first-page":"382","article-title":"A quantitative performance analysis model for GPU architectures","author":"Zhang"},{"key":"ncead29fcbib47","first-page":"72","article-title":"Communication optimization on GPU: a case study of sequence alignment algorithms","author":"Wang"},{"key":"ncead29fcbib48","doi-asserted-by":"publisher","first-page":"2278","DOI":"10.1109\/5.726791","article-title":"Gradient-based learning applied to document recognition","volume":"86","author":"LeCun","year":"1998","journal-title":"Proc. IEEE"},{"key":"ncead29fcbib49","doi-asserted-by":"publisher","first-page":"84","DOI":"10.1145\/3065386","article-title":"ImageNet classification with deep convolutional neural networks","volume":"60","author":"Krizhevsky","year":"2017","journal-title":"Commun. ACM"},{"key":"ncead29fcbib50","doi-asserted-by":"publisher","DOI":"10.3389\/fncom.2021.674154","article-title":"Accelerating inference of convolutional neural networks using in-memory computing","volume":"15","author":"Dazzi","year":"2021","journal-title":"Front. Comput. 
Neurosci."},{"key":"ncead29fcbib51","author":"Bishop","year":"2006"},{"key":"ncead29fcbib52","doi-asserted-by":"publisher","first-page":"459","DOI":"10.1016\/0893-6080(89)90044-0","article-title":"Optimal unsupervised learning in a single-layer linear feedforward neural network","volume":"2","author":"Sanger","year":"1989","journal-title":"Neural Netw."},{"key":"ncead29fcbib53","doi-asserted-by":"publisher","first-page":"3113","DOI":"10.1021\/acs.nanolett.7b00552","article-title":"Experimental demonstration of feature extraction and dimensionality reduction using memristor networks","volume":"17","author":"Choi","year":"2017","journal-title":"Nano Lett."},{"key":"ncead29fcbib54","author":"Lam","year":"2005"},{"key":"ncead29fcbib55","article-title":"MathWorks","author":"MATLAB"}],"container-title":["Neuromorphic Computing and Engineering"],"original-title":[],"link":[{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2634-4386\/ad29fc","content-type":"text\/html","content-version":"am","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2634-4386\/ad29fc\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2634-4386\/ad29fc","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2634-4386\/ad29fc\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2634-4386\/ad29fc\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2634-4386\/ad29fc\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2634-4386\/ad29fc\/pdf","content-t
ype":"application\/pdf","content-version":"am","intended-application":"similarity-checking"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2634-4386\/ad29fc\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,3,1]],"date-time":"2024-03-01T11:45:03Z","timestamp":1709293503000},"score":1,"resource":{"primary":{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2634-4386\/ad29fc"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,3,1]]},"references-count":55,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,3,1]]},"published-print":{"date-parts":[[2024,3,1]]}},"URL":"https:\/\/doi.org\/10.1088\/2634-4386\/ad29fc","relation":{},"ISSN":["2634-4386"],"issn-type":[{"value":"2634-4386","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,3,1]]},"assertion":[{"value":"ReARTSim: an ReRAM ARray Transient Simulator with GPU optimized runtime acceleration","name":"article_title","label":"Article Title"},{"value":"Neuromorphic Computing and Engineering","name":"journal_title","label":"Journal Title"},{"value":"paper","name":"article_type","label":"Article Type"},{"value":"\u00a9 2024 The Author(s). Published by IOP Publishing Ltd","name":"copyright_information","label":"Copyright Information"},{"value":"2023-09-16","name":"date_received","label":"Date Received","group":{"name":"publication_dates","label":"Publication dates"}},{"value":"2024-02-16","name":"date_accepted","label":"Date Accepted","group":{"name":"publication_dates","label":"Publication dates"}},{"value":"2024-03-01","name":"date_epub","label":"Online publication date","group":{"name":"publication_dates","label":"Publication dates"}}]}}