High-Fidelity Diffusion-Based Audio Codec
2024 18th International Workshop on Acoustic Signal Enhancement (IWAENC), 2024
Abstract
The compression of audio signals plays a crucial role in audio storage and transmission, particularly in streaming media applications, where bandwidth utilization is a dominant cost factor. Motivated by this challenge, our objective is to increase the compression rate while preserving audio quality. In this paper, we present sDiff-Codec, a state-of-the-art, high-fidelity, diffusion-based neural audio codec. The condition module and the generator module serve as the encoder and the decoder of sDiff-Codec, respectively, which recasts the sound quality enhancement task as an audio compression task. Additionally, we employ a hybrid quantizer that quantizes the latent information using a hyper-prior model, which generates prior auxiliary information for the entropy model. Experimental results show that sDiff-Codec outperforms the baseline methods for monophonic audio signals at bitrates ranging from 16 kbps to 192 kbps.
Keywords
Neural audio codec, score-based diffusion, hybrid quantizer, condition network, generator network