
IIRI-Net: an Interpretable Convolutional Front-End Inspired by IIR Filters for Speaker Identification.

Neurocomputing (2023)

Abstract
Learning interpretable filters in Convolutional Neural Networks (CNNs) helps to build models that generalize better. Interpretable filters can reveal hidden aspects of the task and help to improve the model. One of the most successful approaches in speech processing is SincNet, in which the first layer of a CNN operating on the raw waveform learns band-pass filters. In this paper, similar to SincNet, meaningful filters are proposed, here inspired by Infinite Impulse Response (IIR) filters. The proposed model uses a phase correction process to ensure phase linearity. The effective length of the truncated IIR filter is calculated from the accumulated energy, and the effect of the filter size on the final results is investigated. The proposed model is evaluated on the speaker identification task using the TIMIT and Librispeech datasets and compared with traditional CNNs and four interpretable kernel-based models. The experimental results show the superiority of the proposed model in both performance and convergence speed. Moreover, patterns of the speech signal that uniquely identify a speaker are analyzed by examining the spectrum of the learned filters.
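
The abstract mentions that the effective length of a truncated IIR filter is derived from its accumulated energy, without giving the exact criterion. The following is a minimal sketch of one plausible reading, not the authors' implementation: the impulse response is generated from the difference equation, and the effective length is taken as the smallest number of leading samples holding a chosen fraction of the total energy. The function names, the 0.99 threshold, and the example coefficients are all assumptions made for illustration.

```python
import math


def impulse_response(b, a, n):
    """Return the first n samples of an IIR filter's impulse response.

    b: feed-forward coefficients, a: feedback coefficients with a[0] == 1.
    Implements y[k] = sum_i b[i] x[k-i] - sum_{j>=1} a[j] y[k-j] for a unit impulse x.
    """
    h = []
    for k in range(n):
        # With a unit impulse as input, the feed-forward part reduces to b[k].
        acc = b[k] if k < len(b) else 0.0
        for j in range(1, len(a)):
            if k - j >= 0:
                acc -= a[j] * h[k - j]
        h.append(acc)
    return h


def effective_length(h, energy_fraction=0.99):
    """Smallest number of leading samples whose accumulated energy reaches
    energy_fraction of the total energy of h (threshold is an assumed value)."""
    total = sum(v * v for v in h)
    running = 0.0
    for k, v in enumerate(h):
        running += v * v
        if running >= energy_fraction * total:
            return k + 1
    return len(h)


def resonator(r, theta):
    """Hypothetical two-pole resonator: poles at radius r and angle +/- theta."""
    return [1.0], [1.0, -2.0 * r * math.cos(theta), r * r]


# One-pole example: h[k] = 0.5 ** k, so energy decays by a factor of 4 per sample.
one_pole = impulse_response([1.0], [1.0, -0.5], 64)
print(effective_length(one_pole))

# Poles closer to the unit circle ring longer and need a longer truncated kernel.
for r in (0.9, 0.99):
    b, a = resonator(r, math.pi / 4)
    print(r, effective_length(impulse_response(b, a, 2048)))
```

Under this reading, a narrower band-pass filter (poles nearer the unit circle) calls for a longer convolution kernel, which is one way the filter size could affect the results discussed in the abstract.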
Key words
eXplainable AI, Auditory Filter Models, IIR filter, SincNet, Speaker Identification, Deep Learning