
2-Bit Conformer Quantization for Automatic Speech Recognition

INTERSPEECH 2023 (2023)

Google Research

Abstract
Large speech models are rapidly gaining traction in the research community. As a result, model compression has become an important topic, so that these models can fit in memory and be served with reduced cost. Practical approaches for compressing automatic speech recognition (ASR) models use int8 or int4 weight quantization. In this study, we propose to develop 2-bit ASR models. We explore the impact of symmetric and asymmetric quantization combined with sub-channel quantization and clipping on both the LibriSpeech dataset and large-scale training data. We obtain a lossless 2-bit Conformer model with 32% model size reduction compared to the state-of-the-art 4-bit Conformer model on LibriSpeech. With the large-scale training data, we obtain a 2-bit Conformer model with over 40% model size reduction against the 4-bit version, at the cost of a 17% relative word error rate degradation.
Key words
speech recognition, model quantization, low-bit quantization

Highlights: This study proposes a 2-bit quantized Conformer model for automatic speech recognition that preserves accuracy while substantially reducing model size. Its novelty lies in applying 2-bit quantization to ASR models and in exploring the impact of symmetric and asymmetric quantization combined with sub-channel quantization and clipping.

Method: Symmetric and asymmetric quantization strategies, combined with sub-channel quantization and clipping, are used to quantize the Conformer model weights to 2 bits.
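
To make the ingredients of the method concrete, here is a minimal NumPy sketch of 2-bit weight fake-quantization with per-sub-channel scales and a clipping ratio. The function name, group size, level assignment, and clipping scheme are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def quantize_2bit(w, group_size=64, symmetric=True, clip_ratio=1.0):
    """Fake-quantize a weight matrix to 2 bits per value.

    Each row is split into sub-channel groups of `group_size` values that
    share one scale (and one zero point in the asymmetric case). `clip_ratio`
    shrinks the quantization range before rounding, trading clipping error
    for finer resolution of the remaining values.
    """
    rows, cols = w.shape
    assert cols % group_size == 0
    g = w.reshape(rows, cols // group_size, group_size)

    if symmetric:
        # Signed 2-bit levels {-2, -1, 0, 1}; range set by clipped per-group max-abs.
        max_abs = np.abs(g).max(axis=-1, keepdims=True) * clip_ratio
        scale = max_abs / 2.0
        q = np.clip(np.round(g / scale), -2, 1)
        deq = q * scale
    else:
        # Unsigned 2-bit levels {0, 1, 2, 3}; per-group min/max set scale and zero point.
        lo = g.min(axis=-1, keepdims=True) * clip_ratio
        hi = g.max(axis=-1, keepdims=True) * clip_ratio
        scale = (hi - lo) / 3.0
        zero = np.round(-lo / scale)
        q = np.clip(np.round(g / scale) + zero, 0, 3)
        deq = (q - zero) * scale

    return deq.reshape(rows, cols)

# Toy usage: compare reconstruction error of the two schemes on random weights.
w = np.random.randn(8, 256).astype(np.float32)
for sym in (True, False):
    err = np.mean((w - quantize_2bit(w, symmetric=sym, clip_ratio=0.9)) ** 2)
    print("symmetric" if sym else "asymmetric", f"MSE={err:.5f}")
```

In such a scheme, smaller sub-channel groups and a well-chosen clipping ratio reduce quantization error at the cost of storing more scales and zero points.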

Experiments: Experiments were conducted on the LibriSpeech dataset and on large-scale training data. The result is a lossless 2-bit Conformer model that is 32% smaller than the state-of-the-art 4-bit Conformer model; on the large-scale training data, the 2-bit Conformer model is over 40% smaller than the 4-bit version, at the cost of a 17% relative increase in word error rate.
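
To see why the reported reductions fall short of the ideal 50% saving of 2-bit over 4-bit storage, a back-of-the-envelope estimate helps. The parameter count, group sizes, and float16 scales below are assumptions for illustration only, not the paper's accounting.

```python
# Hypothetical Conformer weight count and quantization metadata layout.
params = 100e6

# 4-bit weights with one float16 scale per 128-value group, versus
# 2-bit weights with one float16 scale per 64-value sub-channel group.
bits_4 = params * 4 + (params / 128) * 16   # weights + per-group scales
bits_2 = params * 2 + (params / 64) * 16

reduction = 1 - bits_2 / bits_4
print(f"estimated size reduction: {reduction:.0%}")   # ~45% under these assumptions
```

Smaller group sizes, zero points for asymmetric quantization, and any layers kept in higher precision all push the realized reduction further below the ideal 50%, which is consistent with the reported 32% and 40% figures.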