Laplacian-guided Entropy Model in Neural Codec with Blur-dissipated Synthesis
CVPR 2024
Abstract
While replacing Gaussian decoders with a conditional diffusion model enhances
the perceptual quality of reconstructions in neural image compression, the
diffusion model's lack of inductive bias for image data restricts its ability
to achieve state-of-the-art perceptual levels. To address this limitation, we adopt a
non-isotropic diffusion model at the decoder side. This model imposes an
inductive bias aimed at distinguishing between frequency contents, thereby
facilitating the generation of high-quality images. Moreover, our framework is
equipped with a novel entropy model that accurately models the probability
distribution of the latent representation by exploiting spatio-channel
correlations in latent space, while accelerating the entropy decoding step. This
channel-wise entropy model leverages both local and global spatial contexts
within each channel chunk. The global spatial context is built upon the
Transformer, which is specifically designed for image compression tasks. The
designed Transformer employs a Laplacian-shaped positional encoding, the
learnable parameters of which are adaptively adjusted for each channel cluster.
Our experiments demonstrate that our proposed framework yields better
perceptual quality compared to cutting-edge generative-based codecs, and the
proposed entropy model contributes to notable bitrate savings.
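The abstract describes a Transformer whose attention uses a Laplacian-shaped positional encoding with learnable parameters adjusted per channel cluster. A minimal sketch of one way such a bias could enter the attention logits is shown below; the 1-D token layout, the function names, and the use of a single scalar `scale` standing in for the learnable per-cluster parameter are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def laplacian_bias(n_tokens, scale):
    """Additive attention bias shaped like a Laplacian kernel: -|i - j| / scale.

    `scale` is a stand-in for the learnable parameter that the paper
    adapts per channel cluster (assumption: one scalar per cluster).
    Nearby positions receive a smaller penalty, biasing attention
    toward the local spatial context.
    """
    pos = np.arange(n_tokens)
    return -np.abs(pos[:, None] - pos[None, :]) / scale

def softmax(x):
    # Numerically stable softmax over the last axis
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def biased_attention(q, k, v, scale):
    """Scaled dot-product attention with the Laplacian positional bias
    added to the logits before the softmax."""
    d = q.shape[-1]
    logits = q @ k.T / np.sqrt(d) + laplacian_bias(q.shape[0], scale)
    return softmax(logits) @ v
```

With a small `scale` the bias decays quickly and attention concentrates on local neighbors; a large `scale` flattens the bias and lets global context dominate, which is one plausible reading of how a per-cluster learnable parameter could trade off local and global spatial context.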