Stabilizing the Convolution Operations for Neural Network-Based Image and Video Codecs for Machines

2023 IEEE International Conference on Multimedia and Expo Workshops (ICMEW), 2023

Abstract
Deep convolutional neural networks are generally trained in a floating-point number format. However, convolution in the floating-point domain is numerically unstable due to the limited precision and range of the format. For a deep convolutional neural network-based image/video codec, this instability may corrupt reconstructions when the decoder runs in a different computing environment. This paper proposes a post-training quantization technique in which the convolution operations are performed in the integer domain while other operations remain in the floating-point domain. We derive the optimal scaling factors and bit-allocation strategy for the input tensor and kernel weights. With the derived scaling factors, the codec carries out the convolution operations within the significand bits of single-precision floating-point numbers, so the system does not need to support integer operations. Experiments on a learned image codec for machine consumption show that the proposed method achieves performance similar to the floating-point version while behaving stably across platforms.
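The core idea, integer-domain convolution carried in float32 significand bits, can be sketched as follows. This is a hypothetical illustration, not the authors' exact derivation: the helper `quantized_conv1d`, its even bit split between input and kernel, and the 1-D setting are all assumptions for clarity. Input and kernel are scaled and rounded to small integers chosen so that every product and accumulated sum fits within float32's 24-bit significand, making the convolution bit-exact across platforms; the result is then rescaled.

```python
import numpy as np

def quantized_conv1d(x, w):
    """Hypothetical sketch: convolution on integers stored in float32.

    Scaling factors are chosen so that all intermediate products and
    sums are integers below 2**24, hence exactly representable in the
    float32 significand and deterministic across platforms.
    """
    n_acc = len(w)                                   # accumulations per output sample
    total_bits = 24 - int(np.ceil(np.log2(n_acc)))   # headroom reserved for the sum
    bx = total_bits // 2                             # bits allocated to the input
    bw = total_bits - bx                             # bits allocated to the kernel
    sx = (2 ** bx - 1) / max(np.abs(x).max(), 1e-12) # input scaling factor
    sw = (2 ** bw - 1) / max(np.abs(w).max(), 1e-12) # kernel scaling factor
    xq = np.round(x * sx).astype(np.float32)         # integer values stored as floats
    wq = np.round(w * sw).astype(np.float32)
    y = np.convolve(xq, wq, mode="valid")            # exact: everything fits in 24 bits
    return y / (sx * sw)                             # rescale to the original range
```

The only error introduced is the rounding of `x` and `w`; the convolution itself is exact, which is what removes the cross-platform instability.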
Keywords
Convolution, learned image codec, numerical stability