Friday, May 15, 2020

aCortex: a Multi-Purpose Mixed-Signal Neural Inference Accelerator Based on Non-Volatile Memory Devices

(Mohammad Bavandpour, UCSB, presenting on May 20, 2020 at 11:00 AM and 7:00 PM ET)

We introduce “aCortex”, an extremely energy-efficient, fast, compact, and versatile neuromorphic processor architecture suitable for accelerating a wide range of neural network inference models. The most important feature of our processor is a configurable mixed-signal computing array of vector-by-matrix multiplier (VMM) blocks that use embedded nonvolatile memory (NVM) arrays to store weight matrices. In this architecture, the power-hungry analog peripheral circuitry for data integration and conversion is shared among a very large array of VMM blocks, enabling efficient, instant analog-domain VMM operation for different neural layer types with a wide range of layer specifications. This approach also maximizes the processor’s area efficiency by sharing the area-hungry high-voltage programming switching circuitry, as well as the analog peripheries, among a large 2D array of NVM blocks. Such a compact implementation further boosts energy efficiency by lowering the cost of digital data transfer. Other unique features of aCortex include a configurable chain of buffers and data buses, a simple and efficient Instruction Set Architecture (ISA) with its corresponding multi-agent controller, and a customized refresh-free embedded DRAM. In this work, we focus specifically on 55-nm 2D-NOR and 3D-NAND flash memory technologies, and present detailed system-level area/energy/speed estimates targeting several common benchmarks, namely Inception-v1 and ResNet-152, two state-of-the-art deep feedforward networks for image classification, and GNMT, Google’s deep recurrent network for language translation.
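
As a rough illustration of the idea of sharing integration and conversion circuitry across many VMM blocks, the following minimal Python sketch splits a weight matrix into fixed-size blocks, accumulates each block's partial products along the output columns, and applies a single conversion step to the accumulated result. It is not the aCortex design: the function names (`vmm`, `tiled_vmm`, `shared_adc`), the block sizes, and the uniform clipping ADC model are all assumptions made for this example, and the arithmetic is done in ordinary floating point rather than in the analog domain.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def vmm(x: Sequence[float], w: Matrix) -> List[float]:
    """Reference vector-by-matrix product: y[j] = sum_i x[i] * w[i][j]."""
    cols = len(w[0])
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(cols)]


def tiled_vmm(x: Sequence[float], w: Matrix,
              block_rows: int, block_cols: int) -> List[float]:
    """Compute the same product block by block.

    Each (block_rows x block_cols) tile stands in for one VMM block.
    Partial results from tiles in the same column range are summed into
    one accumulator, a stand-in for shared integration (an assumption).
    """
    rows, cols = len(w), len(w[0])
    y = [0.0] * cols
    for r0 in range(0, rows, block_rows):
        for c0 in range(0, cols, block_cols):
            # Slice out one tile of weights and the matching input segment.
            block = [row[c0:c0 + block_cols] for row in w[r0:r0 + block_rows]]
            partial = vmm(x[r0:r0 + block_rows], block)
            for k, p in enumerate(partial):
                y[c0 + k] += p
    return y


def shared_adc(values: Sequence[float], full_scale: float,
               bits: int) -> List[int]:
    """Hypothetical uniform ADC: clip to [-full_scale, full_scale], then
    map linearly to integer codes in [0, 2**bits - 1]."""
    levels = 2 ** bits - 1
    codes = []
    for v in values:
        clipped = max(-full_scale, min(full_scale, v))
        codes.append(round((clipped + full_scale) / (2 * full_scale) * levels))
    return codes


if __name__ == "__main__":
    x = [1, 0, 2, 1]
    w = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
    y = tiled_vmm(x, w, block_rows=2, block_cols=2)
    print(y)
    print(shared_adc(y, full_scale=64.0, bits=8))
```

Before the conversion step, the tiled result matches the plain product, which is why the block size can be chosen for hardware reasons without changing the layer's arithmetic. In this sketch only one conversion routine is applied to the accumulated outputs, loosely mirroring the abstract's point that conversion circuitry is shared rather than replicated per block.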