Nat Commun. 2018 Jun 19;9(1):2385. doi: 10.1038/s41467-018-04484-2.

Efficient and self-adaptive in-situ learning in multilayer memristor neural networks


Can Li et al. Nat Commun.

Abstract

Memristors with tunable resistance states are emerging building blocks of artificial neural networks. However, in situ learning on a large-scale multiple-layer memristor network has yet to be demonstrated because of challenges in device property engineering and circuit integration. Here we monolithically integrate hafnium oxide-based memristors with a foundry-made transistor array into a multiple-layer neural network. We experimentally demonstrate in situ learning capability and achieve competitive classification accuracy on a standard machine learning dataset, which further confirms that the training algorithm allows the network to adapt to hardware imperfections. Our simulation using the experimental parameters suggests that a larger network would further increase the classification accuracy. The memristor neural network is a promising hardware platform for artificial intelligence with high speed-energy efficiency.


Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Memristive platform for in situ learning. a An optical image of a wafer with transistor arrays. b Close-up image of a chip showing arrays of various sizes. c Microscope image showing the 1T1R (one transistor, one memristor) structure of the cell. Scale bar, 10 µm. d Cross-sectional scanning electron microscope image of an individual 1T1R cell, cut with a focused ion beam along the dashed line in c. Scale bar, 2 µm. e Cross-sectional transmission electron microscope image of the integrated Ta/HfO2/Pt memristor. Scale bar, 2 nm. f Conductance of all responsive devices over 20 potentiation/depression epochs of 200 pulses each. g Evolution of conductance during 20 cycles of full potentiation and depression for a single cell with 200 pulses per cycle, showing low cycle-to-cycle variability. More results are shown in Supplementary Fig. 1. h Evolution of conductance over one 200-pulse cycle of full potentiation and depression for all responsive devices in the array, with the median conductance indicated by the yellow line. i Conductance of a 128 × 64 array after single-pulse conductance writing of the discrete Fourier transform matrix. Several stuck devices are visible (in yellow).
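For readers who want to see the kind of array-level writing shown in panel i in concrete terms, the following is a minimal NumPy sketch (not the authors' code) of linearly mapping a signed, DFT-like matrix onto a bounded conductance window for single-pulse writing, with a small fraction of cells left stuck at a low conductance. The conductance window, the 2% stuck rate, and the exact matrix used are assumptions for illustration.

```python
# Hypothetical sketch: map a signed target matrix (cosine part of a DFT-like
# matrix, sized to the 128 x 64 array) onto an assumed conductance window,
# then leave a few non-responsive cells stuck at a low conductance.
import numpy as np

G_MIN, G_MAX = 100e-6, 1000e-6   # assumed programmable window, in siemens
G_STUCK = 10e-6                  # low-conductance state of stuck devices
ROWS, COLS = 128, 64

# Real (cosine) part of a DFT-like matrix, sized to the array.
dft_real = np.cos(2 * np.pi * np.outer(np.arange(ROWS), np.arange(COLS)) / COLS)

# Linear map from [-1, 1] to [G_MIN, G_MAX].
target_g = G_MIN + (dft_real + 1) / 2 * (G_MAX - G_MIN)

# Some cells do not respond to programming and stay near G_STUCK.
rng = np.random.default_rng(0)
stuck = rng.random((ROWS, COLS)) < 0.02   # assumed 2% stuck-device rate
written_g = np.where(stuck, G_STUCK, target_g)
```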
Fig. 2
In situ training algorithm. a Schematic diagram of a two-layer neural network. Each neuron computes a weighted sum of its inputs and applies a nonlinear activation function. b Implementation of the network with a set of memristor crossbars. Each synaptic weight (arrows in a) corresponds to the conductance difference between two memristors (illustrated by the orange columns). Each crossbar computes weighted sums of its input voltages. Between the crossbars is a layer of circuits that reads the current from each wire, converts it to a voltage, and applies the activation function. The activation function was implemented in software in this work. c Flow chart of the in situ training. Steps in green boxes were implemented in hardware in this work, while those in yellow boxes are computationally expensive steps that could be accomplished with circuits integrated onto the chip in the future. The algorithm is described in detail in Methods.
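As a concrete illustration of panel b, the sketch below models the forward pass in NumPy: each weight is the difference of two conductances in a differential pair, each crossbar computes a weighted sum of its input voltages, and the activation function is applied in software. The layer sizes (64-54-10, matching Fig. 3), the conductance range, the input-voltage scale, and the logistic activation are illustrative assumptions, not the paper's exact implementation.

```python
# Minimal sketch of the crossbar forward pass, assuming differential
# conductance pairs and a software activation function.
import numpy as np

rng = np.random.default_rng(0)

def random_pair(shape, g_min=100e-6, g_max=1000e-6):
    """Differential pair of conductance matrices (G_plus, G_minus)."""
    return rng.uniform(g_min, g_max, shape), rng.uniform(g_min, g_max, shape)

def crossbar(v_in, g_plus, g_minus):
    """Weighted sum: currents of the positive columns minus the negative ones."""
    return v_in @ g_plus - v_in @ g_minus

def activation(x):
    """Software nonlinearity (logistic), normalized for numerical convenience."""
    return 1.0 / (1.0 + np.exp(-x / np.max(np.abs(x))))

# Two layers, e.g. 64 inputs -> 54 hidden -> 10 outputs (as in Fig. 3).
g1p, g1m = random_pair((64, 54))
g2p, g2m = random_pair((54, 10))

v = rng.uniform(0, 0.2, (1, 64))            # input voltages for one example
hidden = activation(crossbar(v, g1p, g1m))  # hidden-layer voltages
output = crossbar(hidden, g2p, g2m)         # read currents of the output layer
```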
Fig. 3
In situ online training and inference experiments on Modified National Institute of Standards and Technology (MNIST) handwritten digit recognition. a Typical handwritten digits from the MNIST database. b Photo of the integrated 128 × 64 array during measurement. The array was partitioned into two parts for the first and second layers, respectively. In all, 54 hidden neurons were used, so the first-layer weight matrix is 64 × 54 (implemented with 6912 memristors) and the second-layer matrix is 54 × 10 (implemented with 1080 memristors). The blue and green false-colored areas are the positive and negative parts of the differential pairs. c Minibatch accuracy increases over the course of training. Experimental data followed the defect-free simulation closely, with a consistent 2–4% gap. d The conductance-gate voltage relation extracted from data collected during training. The conductance was read using the scheme described in the Methods. The conductance includes the effects of sneak paths and wire resistance, which makes the measured values smaller and their variance larger than those in Fig. 1b, c. The dashed line indicates the mean conductance, and the error bars show a 95% confidence interval for the measured conductance. The real-time online training accuracy with the readout weight values is shown in an animation in Supplementary Movie 1. e-g Typical correctly classified digit “9” and h-j misclassified digit “8” after the in situ training. e, h Images of the actual digits from the MNIST test set used as input to the network. f, i The raw currents measured from the output-layer neurons. The neuron representing the digit “9” has the highest output current, indicating a correct classification. g, j The corresponding Bayesian probability of each digit, as calculated by a softmax function. More inference samples are shown in Supplementary Fig. 7 and Supplementary Movies 2 and 3.
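The softmax step in panels g and j can be sketched as follows. The current values and the scale factor are hypothetical; the paper's exact normalization is not reproduced here.

```python
# Convert raw output-layer currents (panels f, i) into class probabilities
# (panels g, j) with a softmax. Scale factor and currents are hypothetical.
import numpy as np

def softmax(currents, scale=1e5):
    z = currents * scale          # currents are tiny (amperes), so rescale
    z = z - np.max(z)             # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

raw_currents = np.array([1.2e-5, 0.8e-5, 2.0e-5, 0.9e-5, 1.0e-5,
                         0.7e-5, 1.1e-5, 0.6e-5, 1.3e-5, 3.1e-5])  # hypothetical
probs = softmax(raw_currents)
print("predicted digit:", int(np.argmax(probs)))  # index of the largest output
```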
Fig. 4
Analysis based on experimentally calibrated simulation. a The experimental classification error (accuracy shown in Supplementary Fig. 10) matches the simulated error. The simulation uses experimental parameters, including 11% of devices stuck at 10 μS, 2% conductance-update variation, a limited conductance dynamic range, etc. The simulation under a defect-free assumption shows an accuracy approaching that obtained with TensorFlow. Each data point is the classification error rate on the complete test set (10 000 images) after 500 images (simulation or TensorFlow) or 5000 images (experiment). b The impact of non-responsive devices on inference accuracy with the in situ and ex situ training approaches. Non-responsive devices were stuck in a very low-conductance state (10 µS), the typical defect value observed in the experiment. The result shows that in situ training adapts to the defects, providing much higher defect tolerance than pre-loading ex-situ-trained weights into the network. With 50% stuck-OFF devices, the network still achieves over 60% accuracy. The error bar shows the s.d. over 10 simulations. c The multilayer network also helps with defect tolerance: if one device is stuck, the associated hidden neuron adjusts its connections accordingly. The error bar shows the s.d. over 10 simulations. d Simulation of a larger network built on a larger memristor crossbar (1024 × 512) with experimental parameters (e.g., 11% defect rate) could achieve accuracy above 97%, which suggests that a large memristor network could narrow the accuracy gap with conventional CMOS hardware. The network architecture is shown in Supplementary Fig. 14.
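A minimal sketch of how stuck-OFF devices might be modeled in such a simulation is given below: a fixed fraction of cells is clamped to 10 µS, and the clamp is applied inside every conductance update for in situ training (so the remaining devices can compensate) or only once after training for ex situ weight loading. The helper names, update magnitudes, and conductance bounds are assumptions, not the authors' simulation code.

```python
# Hypothetical model of non-responsive ("stuck-OFF") cells in a crossbar
# simulation: 11% of cells are frozen at 10 uS, as in the caption.
import numpy as np

rng = np.random.default_rng(0)
G_STUCK = 10e-6

def make_stuck_mask(shape, defect_rate=0.11):
    """Boolean mask of non-responsive cells (11% assumed, as in the experiment)."""
    return rng.random(shape) < defect_rate

def apply_update(g, delta_g, stuck, g_min=10e-6, g_max=1000e-6):
    """Conductance update that respects stuck cells and the device range."""
    g_new = np.clip(g + delta_g, g_min, g_max)
    return np.where(stuck, G_STUCK, g_new)   # stuck cells never move

# Example: one mock update of a 64 x 54 conductance array.
g = rng.uniform(100e-6, 1000e-6, (64, 54))
stuck = make_stuck_mask(g.shape)
g = apply_update(g, rng.normal(0, 5e-6, g.shape), stuck)

# In situ training: call apply_update() after every training step, so the
# remaining devices adapt around the frozen cells. Ex situ loading: train
# ideal weights first, then overwrite stuck positions with G_STUCK once,
# which is what degrades accuracy more severely in panel b.
```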

