Facial Image Compression via Neural Image Manifold Compression

IEEE Transactions on Circuits and Systems for Video Technology (2023)

Abstract
Although learning-based image and video coding techniques have developed rapidly, their signal-fidelity-driven objective steers them away from a coding framework that is effective and efficient for both human and machine vision. In this paper, we address this issue by using generative models to bridge the gap between full fidelity (for human vision) and high discrimination (for machine vision). Relying on existing pretrained generative adversarial networks (GANs), we build a GAN inversion framework that projects an image into a low-dimensional natural image manifold. The feature in this manifold, called the latent code, is highly discriminative and also encodes the appearance information of the image. By applying a variational bit-rate constraint with a hyperprior model to model and suppress the entropy of the image manifold code, our method can meet the needs of both machine and human vision at very low bit-rates. To improve the visual quality of image reconstruction, we further propose multiple latent codes and scalable inversion. The former obtains several latent codes during inversion, while the latter additionally compresses and transmits a shallow compact feature to support visual reconstruction. Experimental results demonstrate the superiority of our method in both human vision tasks, i.e., image reconstruction, and machine vision tasks, including semantic parsing and attribute prediction.
Keywords
Video Coding for Machine, Generative Adversarial Networks, GAN Inversion, Multiple Latent Codes, Scalable GAN Inversion