Nat Commun. 2018 Jun 19;9(1):2385. doi: 10.1038/s41467-018-04484-2.

Efficient and self-adaptive in-situ learning in multilayer memristor neural networks


Can Li et al. Nat Commun.

Abstract

Memristors with tunable resistance states are emerging building blocks of artificial neural networks. However, in situ learning on a large-scale multiple-layer memristor network has yet to be demonstrated because of challenges in device property engineering and circuit integration. Here we monolithically integrate hafnium oxide-based memristors with a foundry-made transistor array into a multiple-layer neural network. We experimentally demonstrate in situ learning capability and achieve competitive classification accuracy on a standard machine learning dataset, which further confirms that the training algorithm allows the network to adapt to hardware imperfections. Our simulation using the experimental parameters suggests that a larger network would further increase the classification accuracy. The memristor neural network is a promising hardware platform for artificial intelligence with high speed-energy efficiency.


Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Memristive platform for in situ learning. a An optical image of a wafer with transistor arrays. b Close-up image of a chip showing arrays of various sizes. c Microscope image showing the 1T1R (one transistor, one memristor) structure of the cell. Scale bar, 10 µm. d Cross-sectional scanning electron microscope image of an individual 1T1R cell, cut with a focused ion beam along the dashed line in c. Scale bar, 2 µm. e Cross-sectional transmission electron microscope image of the integrated Ta/HfO2/Pt memristor. Scale bar, 2 nm. f Conductance of all responsive devices over 20 potentiation/depression epochs of 200 pulses each. g Evolution of conductance during 20 cycles of full potentiation and depression for a single cell with 200 pulses per cycle, showing low cycle-to-cycle variability. More results are shown in Supplementary Fig. 1. h Evolution of conductance over one 200-pulse cycle of full potentiation and depression for all responsive devices in the array, with the median conductance indicated by the yellow line. i Conductance of a 128 × 64 array after single-pulse conductance writing of the discrete Fourier transform matrix. Several stuck devices are visible (in yellow).
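For readers who want to see the kind of array-level writing shown in panel i in concrete terms, the following is a minimal NumPy sketch (not the authors' code) of linearly mapping a signed, DFT-like matrix onto a bounded conductance window for single-pulse writing, with a small fraction of cells left stuck at a low conductance. The conductance window, the 2% stuck rate, and the exact matrix used are assumptions for illustration.

```python
# Hypothetical sketch: map a signed target matrix (cosine part of a DFT-like
# matrix, sized to the 128 x 64 array) onto an assumed conductance window,
# then leave a few non-responsive cells stuck at a low conductance.
import numpy as np

G_MIN, G_MAX = 100e-6, 1000e-6   # assumed programmable window, in siemens
G_STUCK = 10e-6                  # low-conductance state of stuck devices
ROWS, COLS = 128, 64

# Real (cosine) part of a DFT-like matrix, sized to the array.
dft_real = np.cos(2 * np.pi * np.outer(np.arange(ROWS), np.arange(COLS)) / COLS)

# Linear map from [-1, 1] to [G_MIN, G_MAX].
target_g = G_MIN + (dft_real + 1) / 2 * (G_MAX - G_MIN)

# Some cells do not respond to programming and stay near G_STUCK.
rng = np.random.default_rng(0)
stuck = rng.random((ROWS, COLS)) < 0.02   # assumed 2% stuck-device rate
written_g = np.where(stuck, G_STUCK, target_g)
```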
Fig. 2
In situ training algorithm. a Schematic diagram of a two-layer neural network. Each neuron computes a weighted sum of its inputs and applies a nonlinear activation function. b Implementation of the network with a set of memristor crossbars. Each synaptic weight (arrows in a) corresponds to the conductance difference between two memristors (illustrated by the orange columns). Each crossbar computes weighted sums of its input voltages. Between the crossbars is a layer of circuits that reads the current from each wire, converts it to a voltage, and applies the activation function. The activation function was implemented in software in this work. c Flow chart of the in situ training. Steps in green boxes were implemented in hardware in this work, while those in yellow boxes are computationally expensive steps that could be accomplished with circuits integrated onto the chip in the future. The algorithm is described in detail in Methods.
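As a concrete illustration of panel b, the sketch below models the forward pass in NumPy: each weight is the difference of two conductances in a differential pair, each crossbar computes a weighted sum of its input voltages, and the activation function is applied in software. The layer sizes (64-54-10, matching Fig. 3), the conductance range, the input-voltage scale, and the logistic activation are illustrative assumptions, not the paper's exact implementation.

```python
# Minimal sketch of the crossbar forward pass, assuming differential
# conductance pairs and a software activation function.
import numpy as np

rng = np.random.default_rng(0)

def random_pair(shape, g_min=100e-6, g_max=1000e-6):
    """Differential pair of conductance matrices (G_plus, G_minus)."""
    return rng.uniform(g_min, g_max, shape), rng.uniform(g_min, g_max, shape)

def crossbar(v_in, g_plus, g_minus):
    """Weighted sum: currents of the positive columns minus the negative ones."""
    return v_in @ g_plus - v_in @ g_minus

def activation(x):
    """Software nonlinearity (logistic), normalized for numerical convenience."""
    return 1.0 / (1.0 + np.exp(-x / np.max(np.abs(x))))

# Two layers, e.g. 64 inputs -> 54 hidden -> 10 outputs (as in Fig. 3).
g1p, g1m = random_pair((64, 54))
g2p, g2m = random_pair((54, 10))

v = rng.uniform(0, 0.2, (1, 64))            # input voltages for one example
hidden = activation(crossbar(v, g1p, g1m))  # hidden-layer voltages
output = crossbar(hidden, g2p, g2m)         # read currents of the output layer
```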
Fig. 3
In situ online training and inference experiments on Modified National Institute of Standards and Technology (MNIST) handwritten digit recognition. a Typical handwritten digits from the MNIST database. b Photo of the integrated 128 × 64 array during measurement. The array was partitioned into two parts for the first and second layers, respectively. In all, 54 hidden neurons were used, so the first-layer weight matrix is 64 × 54 (implemented with 6912 memristors) and the second-layer matrix is 54 × 10 (implemented with 1080 memristors). The blue and green false-colored areas are the positive and negative parts of the differential pairs. c Minibatch accuracy increases over the course of training. Experimental data followed the defect-free simulation closely, with a consistent 2–4% gap. d The conductance-gate voltage relation extracted from data collected during training. The conductance was read using the scheme described in the Methods. The conductance includes the effects of sneak paths and wire resistance, which makes the measured values smaller and their variance larger than those in Fig. 1b, c. The dashed line indicates the mean conductance, and the error bars show a 95% confidence interval for the measured conductance. The real-time online training accuracy with the readout weight values is shown in an animation in Supplementary Movie 1. e-g Typical correctly classified digit “9” and h-j misclassified digit “8” after the in situ training. e, h Images of the actual digits from the MNIST test set used as input to the network. f, i The raw currents measured from the output-layer neurons. The neuron representing the digit “9” has the highest output current, indicating a correct classification. g, j The corresponding Bayesian probability of each digit, as calculated by a softmax function. More inference samples are shown in Supplementary Fig. 7 and Supplementary Movies 2 and 3.
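The softmax step in panels g and j can be sketched as follows. The current values and the scale factor are hypothetical; the paper's exact normalization is not reproduced here.

```python
# Convert raw output-layer currents (panels f, i) into class probabilities
# (panels g, j) with a softmax. Scale factor and currents are hypothetical.
import numpy as np

def softmax(currents, scale=1e5):
    z = currents * scale          # currents are tiny (amperes), so rescale
    z = z - np.max(z)             # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

raw_currents = np.array([1.2e-5, 0.8e-5, 2.0e-5, 0.9e-5, 1.0e-5,
                         0.7e-5, 1.1e-5, 0.6e-5, 1.3e-5, 3.1e-5])  # hypothetical
probs = softmax(raw_currents)
print("predicted digit:", int(np.argmax(probs)))  # index of the largest output
```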
Fig. 4
Analysis based on experimentally calibrated simulation. a The experimental classification error (accuracy shown in Supplementary Fig. 10) matches the simulated error. The simulation uses experimental parameters, including 11% of devices stuck at 10 μS, 2% conductance-update variation, a limited conductance dynamic range, etc. The simulation under a defect-free assumption shows an accuracy approaching that obtained with TensorFlow. Each data point is the classification error rate on the complete test set (10 000 images) after 500 images (simulation or TensorFlow) or 5000 images (experiment). b The impact of non-responsive devices on inference accuracy with the in situ and ex situ training approaches. Non-responsive devices were stuck in a very low-conductance state (10 µS), the typical defect value observed in the experiment. The result shows that in situ training adapts to the defects, providing much higher defect tolerance than pre-loading ex-situ-trained weights into the network. With 50% stuck-OFF devices, the network still achieves over 60% accuracy. The error bar shows the s.d. over 10 simulations. c The multilayer network also helps with defect tolerance: if one device is stuck, the associated hidden neuron adjusts its connections accordingly. The error bar shows the s.d. over 10 simulations. d Simulation of a larger network built on a larger memristor crossbar (1024 × 512) with experimental parameters (e.g., 11% defect rate) could achieve accuracy above 97%, which suggests that a large memristor network could narrow the accuracy gap with conventional CMOS hardware. The network architecture is shown in Supplementary Fig. 14.
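A minimal sketch of how stuck-OFF devices might be modeled in such a simulation is given below: a fixed fraction of cells is clamped to 10 µS, and the clamp is applied inside every conductance update for in situ training (so the remaining devices can compensate) or only once after training for ex situ weight loading. The helper names, update magnitudes, and conductance bounds are assumptions, not the authors' simulation code.

```python
# Hypothetical model of non-responsive ("stuck-OFF") cells in a crossbar
# simulation: 11% of cells are frozen at 10 uS, as in the caption.
import numpy as np

rng = np.random.default_rng(0)
G_STUCK = 10e-6

def make_stuck_mask(shape, defect_rate=0.11):
    """Boolean mask of non-responsive cells (11% assumed, as in the experiment)."""
    return rng.random(shape) < defect_rate

def apply_update(g, delta_g, stuck, g_min=10e-6, g_max=1000e-6):
    """Conductance update that respects stuck cells and the device range."""
    g_new = np.clip(g + delta_g, g_min, g_max)
    return np.where(stuck, G_STUCK, g_new)   # stuck cells never move

# Example: one mock update of a 64 x 54 conductance array.
g = rng.uniform(100e-6, 1000e-6, (64, 54))
stuck = make_stuck_mask(g.shape)
g = apply_update(g, rng.normal(0, 5e-6, g.shape), stuck)

# In situ training: call apply_update() after every training step, so the
# remaining devices adapt around the frozen cells. Ex situ loading: train
# ideal weights first, then overwrite stuck positions with G_STUCK once,
# which is what degrades accuracy more severely in panel b.
```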

