Reconfigurable Event-Driven Spiking Neuromorphic Computing near High-Bandwidth Memory

Gopabandhu Hota, Gwenevere Frank, Keli Wang, Abhinav Uppal, Omowuyi Olajide, Jeffrey C. Liu, Shashank Bansal, Kenneth Yoshimoto, Qingbo Wang, Stephen R. Deiss, Gert Cauwenberghs

2023 IEEE Biomedical Circuits and Systems Conference (BioCAS), 2023

Abstract
This paper presents the hardware architecture and processing dataflow of a general-purpose, reconfigurable, spike-event-driven computing platform for running bio-inspired neural networks in real time. A near-memory computing architecture and hardware-aware compiler optimization facilitate low-latency, distributed parallel execution of user-specified network models of arbitrary connection topology. The massively parallel processing architecture scales to very large networks through hierarchical address-event routing (HiAER) of spikes, which implements long-range synaptic connectivity across the network. Large bio-inspired AI model capacity and high-performance neuronal network emulation in parallel hardware are enabled by memory-efficient network storage and an execution dataflow optimized for sparse networks. Implemented on a high-end Field Programmable Gate Array (FPGA), the HiAER-Spike platform captures axonal and neuronal events in internal FPGA memories which, when activated, fetch synaptic connectivity from lookup tables stored in High Bandwidth Memory (HBM) 2.0 to update the neuron membrane state variables. Threshold-crossing events trigger output spikes, which in turn propagate as axonal activity through the network. The system streams input and output spike events over a PCIe 3.0 interface with the host station. A compiler toolchain generates the necessary FPGA and HBM hardware configuration for partitioning and mapping the network topology and synaptic connectivity onto the HBM data structure. The memory organization and microarchitecture of the FPGA implementation support 4M axons, 4M neurons, and 1B synapses, and leverage the nominal 450 GB/s throughput of HBM 2.0 to deliver 0.1 tera synaptic operations per second (TSynOps).
The architecture efficiently handles both sparse connectivity and sparse activity for robust and low-latency event-driven inference for both edge and cloud computing, with examples demonstrating spike-based visual inference.
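The event-driven dataflow described above can be sketched in a few lines: active axon events fetch their fan-out from a connectivity lookup table (stored in HBM on the real hardware), accumulate weights into neuron membrane state variables, and emit an output spike on threshold crossing. The table contents, threshold, and reset behavior below are illustrative assumptions, not the paper's actual parameters; the sketch only shows why cost scales with activity and connectivity sparsity rather than network size.

```python
from collections import deque

THRESHOLD = 1.0  # hypothetical firing threshold

# Synaptic connectivity lookup table: axon id -> list of (neuron id, weight).
# In the HiAER-Spike system this table resides in HBM 2.0 and is fetched
# only when the corresponding axon event fires (values here are made up).
synapse_lut = {
    0: [(0, 0.6), (1, 0.3)],
    1: [(0, 0.5), (2, 0.9)],
    2: [(1, 0.8)],
}

def run_events(input_axons, lut, n_neurons, threshold=THRESHOLD):
    """Event-driven emulation: only active axons trigger memory fetches
    and membrane updates; silent neurons cost nothing."""
    v = [0.0] * n_neurons          # neuron membrane state variables
    events = deque(input_axons)    # pending axon events (input spikes)
    output_spikes = []
    while events:
        axon = events.popleft()
        for neuron, weight in lut.get(axon, []):  # fetch fan-out from LUT
            v[neuron] += weight                   # synaptic update
            if v[neuron] >= threshold:            # threshold crossing
                v[neuron] = 0.0                   # reset after spike
                output_spikes.append(neuron)
                # In the full system this spike would re-enter the network
                # as a new axon event via hierarchical address-event routing.
    return output_spikes, v

spikes, state = run_events([0, 1], synapse_lut, n_neurons=3)
```

With these toy values, axon 0 deposits 0.6 and 0.3 onto neurons 0 and 1, then axon 1 pushes neuron 0 past threshold, so it spikes and resets while neuron 2 ends at 0.9. The total work is proportional to the number of active synapses traversed, which is the property that lets the sparse-activity, sparse-connectivity regime approach the HBM bandwidth limit.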
Keywords
Spiking neural networks, neuromorphic systems engineering, reconfigurable hardware, distributed computing