Related papers
Addressing the Hardware Resource Requirements of Network-On-Chip Based Neural Architectures
Jim Harkin
Proceedings of the International Conference on Neural Computation Theory and Applications
A Low-Power RRAM Memory Block for Embedded, Multi-Level Weight and Bias Storage in Artificial Neural Networks
Marco Breiling
Micromachines, 2021
Pattern recognition as a computing task is well suited to machine learning algorithms built on artificial neural networks (ANNs). Computing systems using ANNs require storage for the weight and bias values of the processing elements of the individual neurons. This paper introduces a memory block using resistive memory cells (RRAM) to realize this weight and bias storage in an embedded and distributed way while also offering programmability and multi-level storage. By implementing power gating, overall power consumption is decreased significantly without data loss, taking advantage of the non-volatility of the RRAM technology. Owing to the versatility of the peripheral circuitry, the presented memory concept can be adapted to different applications and RRAM technologies.
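As a rough illustration of the multi-level storage idea in this abstract, the sketch below maps real-valued weights onto a handful of discrete conductance levels and reconstructs the values a readout would see. The level count, conductance range, and the mapping itself are illustrative assumptions, not the circuit described in the paper.

```python
import numpy as np

def quantize_to_rram_levels(weights, n_levels=4, g_min=1e-6, g_max=1e-4):
    """Snap real-valued weights onto n_levels discrete conductance states.

    Toy model of multi-level weight storage: weights are normalized to
    [0, 1], rounded to the nearest of n_levels states, and expressed as
    conductances between g_min and g_max (siemens). The level count and
    conductance range are illustrative assumptions.
    """
    w_min, w_max = weights.min(), weights.max()
    normalized = (weights - w_min) / (w_max - w_min + 1e-12)
    levels = np.round(normalized * (n_levels - 1)) / (n_levels - 1)
    conductances = g_min + levels * (g_max - g_min)
    # Weight values the readout circuitry would effectively recover.
    reconstructed = w_min + levels * (w_max - w_min)
    return conductances, reconstructed

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.5, size=(8, 8))
g, w_q = quantize_to_rram_levels(w, n_levels=4)
print("max quantization error:", np.abs(w - w_q).max())
```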
Accuracy and Resiliency of Analog Compute-in-Memory Inference Engines
SUBRAMANIAN IYER
2020
Recently, analog compute-in-memory (CIM) architectures based on emerging analog non-volatile memory (NVM) technologies have been explored for deep neural networks (DNNs) to improve energy efficiency. Such architectures, however, leverage charge conservation, an operation with infinite resolution, and are therefore susceptible to errors. The computations in DNNs realized with analog NVM thus carry high uncertainty due to device stochasticity. Several reports have demonstrated the use of analog NVM for CIM at a limited scale, and it is unclear whether the uncertainties in computation will prohibit large-scale DNNs. To explore this critical issue of scalability, this paper first presents a simulation framework to evaluate the feasibility of large-scale DNNs based on a CIM architecture and analog NVM. Simulation results show that DNNs trained for high-precision digital computing engines are not resilient against the uncertainty of the analog NVM devices. To avoid such catastrophic failures, this p...
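To make the resilience question concrete, a generic way to probe it is to perturb stored weights with device-like noise and measure how far an analog-style matrix-vector product drifts from the ideal result. The sketch below is such a noise-injection experiment under an assumed multiplicative Gaussian noise model; it is not the simulation framework presented in the paper.

```python
import numpy as np

def analog_mvm(weights, x, sigma_rel=0.05, rng=None):
    """Matrix-vector product with multiplicative weight noise.

    Each stored weight is perturbed by Gaussian noise proportional to its
    magnitude, a first-order stand-in for analog NVM device stochasticity.
    sigma_rel is an assumed relative noise level.
    """
    if rng is None:
        rng = np.random.default_rng()
    noisy_w = weights * (1.0 + sigma_rel * rng.standard_normal(weights.shape))
    return noisy_w @ x

rng = np.random.default_rng(1)
W = rng.normal(size=(64, 128))
x = rng.normal(size=128)
ideal = W @ x
for sigma in (0.01, 0.05, 0.10):
    y = analog_mvm(W, x, sigma_rel=sigma, rng=rng)
    rel_err = np.linalg.norm(y - ideal) / np.linalg.norm(ideal)
    print(f"sigma_rel={sigma:.2f}  relative output error={rel_err:.3f}")
```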
A 0.32–128 TOPS, Scalable Multi-Chip-Module-Based Deep Neural Network Inference Accelerator With Ground-Referenced Signaling in 16 nm
Joel Emer
IEEE Journal of Solid-State Circuits, 2020
Fully On-Chip MAC at 14 nm Enabled by Accurate Row-Wise Programming of PCM-Based Weights and Parallel Vector-Transport in Duration-Format
Alexander Friz
IEEE Transactions on Electron Devices, 2021
Using the IBM Analog In-Memory Hardware Acceleration Kit for Neural Network Training and Inference
Kaoutar El Maghraoui
arXiv (Cornell University), 2023
A Multilayer Neural Accelerator With Binary Activations Based on Phase-Change Memory
Andrea Bonfanti
IEEE Transactions on Electron Devices
Impact of On-chip Interconnect on In-memory Acceleration of Deep Neural Networks
Sumit K Mandal
ACM Journal on Emerging Technologies in Computing Systems, 2022
With the widespread use of Deep Neural Networks (DNNs), machine learning algorithms have evolved in two diverse directions: one with ever-increasing connection density for better accuracy, the other with more compact sizing for energy efficiency. The increase in connection density increases on-chip data movement, which makes efficient on-chip communication a critical function of the DNN accelerator. The contribution of this work is threefold. First, we illustrate that point-to-point (P2P) interconnect is incapable of handling the high volume of on-chip data movement for DNNs. Second, we evaluate P2P and network-on-chip (NoC) interconnect (with a regular topology such as a mesh) for SRAM- and ReRAM-based in-memory computing (IMC) architectures across a range of DNNs. This analysis shows the necessity of choosing the interconnect carefully for an IMC DNN accelerator. Finally, we perform an experimental evaluation for different DNNs to empirically obtain the performance of the IMC ...
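The data-movement pressure described above can be sized with a back-of-the-envelope count of the activations that must cross the interconnect between consecutive layers, as in the sketch below. The layer sizes and the 8-bit activation assumption are illustrative; they are not the workloads, NoC model, or results evaluated in the paper.

```python
# Output activation count of each stage, input image first (illustrative sizes).
stage_outputs = [150528, 802816, 401408, 100352, 4096, 1000]
bytes_per_activation = 1  # assume 8-bit quantized activations

# Everything except the final classifier output must cross the interconnect
# to reach the compute tile of the next stage.
hop_bytes = [n * bytes_per_activation for n in stage_outputs[:-1]]
print("per-hop traffic (bytes):", hop_bytes)
print("total on-chip movement per inference: %.2f MB" % (sum(hop_bytes) / 1e6))
```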
In-memory Implementation of On-chip Trainable and Scalable ANN for AI/ML Applications
Abhash Kumar
2020
Traditional von Neumann architecture-based processors become inefficient in terms of energy and throughput because they involve separate processing and memory units, a limitation also known as the memory wall. The memory wall problem is further exacerbated when massive parallelism and frequent data movement are required between processing and memory units for real-time implementation of artificial neural networks (ANNs) that enable many intelligent applications. One of the most promising approaches to addressing the memory wall problem is to carry out computations inside the memory core itself, which enhances memory bandwidth and energy efficiency for extensive computations. This paper presents an in-memory computing architecture for ANNs enabling artificial intelligence (AI) and machine learning (ML) applications. The proposed architecture utilizes a deep in-memory architecture based on a standard six-transistor (6T) static random access memory (SRAM) core for the implementation of a multi-layered perce...
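For orientation, the sketch below is a plain two-layer perceptron forward pass; in the architecture summarized above, the two matrix-vector products inside forward() are the operations the 6T SRAM core would evaluate in place. Layer sizes, the ReLU choice, and the random weights are assumptions for illustration only.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

class TwoLayerMLP:
    """Minimal multi-layer perceptron. The two matrix-vector products are
    the operations an in-memory SRAM core would perform without moving the
    weights off-chip; layer sizes here are illustrative assumptions."""

    def __init__(self, n_in=784, n_hidden=128, n_out=10, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0, 0.05, size=(n_hidden, n_in))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0, 0.05, size=(n_out, n_hidden))
        self.b2 = np.zeros(n_out)

    def forward(self, x):
        h = relu(self.W1 @ x + self.b1)   # first matrix-vector product + activation
        return self.W2 @ h + self.b2      # second matrix-vector product

mlp = TwoLayerMLP()
x = np.random.default_rng(1).random(784)
print("logits:", mlp.forward(x)[:5])
```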
Gradient descent-based programming of analog in-memory computing cores
Kevin Brew
2022 International Electron Devices Meeting (IEDM)