Design of an Always-On Deep Neural Network-Based 1-μW Voice Activity Detector Aided With a Customized Software Model for Analog Feature Extraction

IEEE Journal of Solid-State Circuits (2019)

Cited by 48 | Viewed 56
Abstract
This paper presents an ultra-low-power voice activity detector (VAD). It uses analog signal processing for acoustic feature extraction (AFE) directly on the microphone output, approximate event-driven analog-to-digital conversion (ED-ADC), and a digital deep neural network (DNN) for speech/non-speech classification. New circuits, including the low-noise amplifier, bandpass filter, and full-wave rectifier, contribute to a more than 9× reduction in normalized power per channel in the feature-extraction front-end compared to the best prior art. The digital DNN is a three-hidden-layer binarized multilayer perceptron (MLP) with a 2-neuron output layer and a 48-neuron input layer that receives parallel event streams from the ED-ADCs. To obtain the DNN weights via off-line training, a customized front-end model written in Python is constructed to accelerate feature generation in software emulation; the model parameters are extracted from Spectre simulations. The chip, fabricated in 0.18-μm CMOS, has a core area of 1.66 × 1.52 mm² and consumes 1 μW. Classification measurements using 1-hour, 10-dB signal-to-noise-ratio audio with restaurant background noise show a mean speech/non-speech hit rate of 84.4%/85.4% with a 1.88%/4.65% 1-σ variation across ten dies, all loaded with the same weights.
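The abstract describes two software-facing pieces: a Python model of the analog feature-extraction chain (used to generate training features, with parameters fit from Spectre simulations) and a three-hidden-layer binarized MLP classifier. The sketch below is a minimal, hypothetical emulation of that kind of pipeline, not the authors' model: the channel count, Butterworth filter order, frame length, hidden-layer width, and the sign-based input binarization are all assumptions, and the random weights stand in for trained ones.

```python
# Hypothetical sketch of a software front-end model feeding a binarized MLP.
# Only the 48-neuron input, 3 hidden layers, and 2-neuron output come from the
# abstract; every other parameter here is an illustrative assumption.
import numpy as np
from scipy.signal import butter, lfilter

FS = 16_000      # assumed audio sample rate (Hz)
N_CHANNELS = 16  # assumed filterbank size; 48 inputs could be 16 channels x 3 frames
HIDDEN = 64      # hypothetical hidden-layer width (not stated in the abstract)

def bandpass_filterbank(x, fs=FS, n_ch=N_CHANNELS, f_lo=100.0, f_hi=5000.0):
    """Emulate an analog bandpass filterbank with log-spaced 2nd-order Butterworth bands."""
    edges = np.geomspace(f_lo, f_hi, n_ch + 1)
    bands = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        b, a = butter(2, [lo, hi], btype="bandpass", fs=fs)
        bands.append(lfilter(b, a, x))
    return np.stack(bands)  # shape: (n_ch, n_samples)

def rectify_and_average(bands, frame=512):
    """Full-wave rectify each channel, then average per frame (crude envelope energy)."""
    rect = np.abs(bands)
    n_frames = rect.shape[1] // frame
    rect = rect[:, :n_frames * frame].reshape(rect.shape[0], n_frames, frame)
    return rect.mean(axis=2)  # shape: (n_ch, n_frames)

def binarize(w):
    """Sign binarization used by binarized networks (zeros mapped to +1)."""
    return np.where(w >= 0, 1.0, -1.0)

def bnn_mlp_forward(features, weights):
    """Forward pass of a 3-hidden-layer binarized MLP with a 2-neuron output layer."""
    # Mean-centered sign binarization stands in for the paper's ED-ADC event
    # streams, which this sketch does not model.
    h = np.sign(features - features.mean())
    for w in weights[:-1]:
        h = np.sign(binarize(w) @ h)    # binary matmul + sign activation
    return binarize(weights[-1]) @ h    # two logits: speech / non-speech

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    audio = rng.standard_normal(FS)     # 1 s of noise as a stand-in signal
    feats = rectify_and_average(bandpass_filterbank(audio))
    x = feats[:, :3].reshape(-1)        # 16 channels x 3 frames = 48 inputs
    weights = [rng.standard_normal(s) for s in
               [(HIDDEN, 48), (HIDDEN, HIDDEN), (HIDDEN, HIDDEN), (2, HIDDEN)]]
    logits = bnn_mlp_forward(x, weights)
    print("speech" if logits[0] > logits[1] else "non-speech", logits)
```

A design note on why binarization fits this application: once weights and activations are constrained to ±1, each matrix multiply reduces to XNOR-and-popcount operations in digital hardware, which is part of what makes an always-on digital DNN plausible within a roughly 1-μW power budget.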
Keywords
Feature extraction, Training, Band-pass filters, Computational modeling, Software, Acoustics, Detectors