High-Fidelity Diffusion-Based Audio Codec

Zhengpu Zhang, Jianyuan Feng, Yongjian Mao, Yehang Zhu, Junjie Shi, Xuzhou Ye, Shilei Liu, Derong Liu, Chuanzeng Huang

2024 18th International Workshop on Acoustic Signal Enhancement (IWAENC), 2024

Abstract

The compression of audio signals plays a crucial role in audio storage and transmission, particularly in streaming media applications, where bandwidth utilization is a dominant cost factor. Motivated by this challenge, our objective is to keep raising the compression rate while preserving audio quality. In this paper, we present sDiff-Codec, a state-of-the-art, high-fidelity, diffusion-based neural audio codec. The condition module and the generator module serve as the encoder and the decoder of sDiff-Codec, respectively, so that the sound quality enhancement task becomes an audio compression task. Additionally, we employ a hybrid quantizer that quantizes the latent information using a hyper-prior model, which generates prior auxiliary information for the entropy model. Experimental results show that sDiff-Codec outperforms the baseline methods for monophonic audio signals at bitrates from 16 kbps to 192 kbps.

Keywords

Neural audio codec, score-based diffusion, hybrid quantizer, condition network, generator network
