Cross-Frame Transformer-Based Spatio-Temporal Video Super-Resolution

Wenhui Zhang, Mingliang Zhou, Cheng Ji, Xiubao Sui, Junqi Bai

IEEE Transactions on Broadcasting (2022)

Abstract

In this paper, we explore the spatio-temporal video super-resolution task, which aims to generate a high-resolution, high-frame-rate video from an existing video with low resolution and low frame rate. First, we propose an end-to-end spatio-temporal video super-resolution network composed chiefly of cross-frame transformers instead of traditional convolutions. Specifically, the cross-frame transformer module divides the input feature sequence into query, key, and value matrices, and then obtains the maximum similarity and similarity coefficient matrices between the neighboring and current feature maps through self-attention operations. Next, we propose a multi-level residual reconstruction module, which makes full use of the maximum similarity and similarity coefficient matrices obtained by the cross-frame transformer to reconstruct the high-resolution, high-frame-rate results from coarse to fine. Qualitative and quantitative evaluation results show that our method offers better performance and requires fewer training parameters than existing two-stage networks.
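The similarity step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes cosine similarity as the similarity measure, a softmax over each row as the "similarity coefficient", and no learned query/key/value projections. The function and variable names (`cross_frame_attention`, `current`, `neighbor`) are hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def softmax(row):
    """Numerically stable softmax over one row of similarity scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_frame_attention(current, neighbor):
    """Compare current-frame features (queries) with neighbor-frame features.

    current:  list of feature vectors, one per position in the current frame.
    neighbor: list of feature vectors from a neighboring frame; they serve as
              both keys and values here (an assumption, no learned projections).
    """
    # Similarity of every query position to every key position.
    similarity = [[cosine_similarity(q, k) for k in neighbor] for q in current]
    # Strongest match per query position, and where it was found.
    max_similarity = [max(row) for row in similarity]
    best_match = [row.index(max(row)) for row in similarity]
    # Row-wise softmax, used here as a stand-in for the similarity coefficients.
    coefficients = [softmax(row) for row in similarity]
    # Weighted sum of neighbor features for each query position.
    dim = len(neighbor[0])
    aggregated = [
        [sum(w * v[d] for w, v in zip(weights, neighbor)) for d in range(dim)]
        for weights in coefficients
    ]
    return max_similarity, best_match, coefficients, aggregated


if __name__ == "__main__":
    current = [[1.0, 0.0], [0.0, 1.0]]
    neighbor = [[0.0, 2.0], [3.0, 0.0], [1.0, 1.0]]
    max_sim, best, coeffs, agg = cross_frame_attention(current, neighbor)
    print(max_sim, best)
```

In this toy input, each current-frame vector has one neighbor vector pointing in exactly the same direction, so the maximum similarity identifies that position. A reconstruction stage could then use both the hard match and the soft weights, which is the role the abstract assigns to the two matrices.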

Keywords

Superresolution, Feature extraction, Transformers, Image reconstruction, Interpolation, Convolution, Task analysis, Transformer network, spatio-temporal video super-resolution, cross-frame transformer module, multi-level residual reconstruction, self-attention, video frame interpolation
