PeLK: Parameter-efficient Large Kernel ConvNets with Peripheral Convolution
CVPR 2024
Abstract
Recently, some large-kernel ConvNets have struck back with appealing performance
and efficiency. However, given the quadratic complexity of convolution, scaling up
kernels introduces an enormous number of parameters, and the proliferated
parameters can induce severe optimization problems. Due to these issues, current
CNNs compromise by scaling up to 51x51 in the form of stripe convolution (i.e.,
51x5 + 5x51) and start to saturate as the kernel size continues growing. In
this paper, we delve into addressing these vital issues and explore whether we
can continue scaling up kernels for more performance gains. Inspired by human
vision, we propose a human-like peripheral convolution that efficiently reduces
the parameter count of dense grid convolution by over 90% through parameter sharing
and manages to scale the kernel size up to extremely large values. Our peripheral
convolution behaves highly similarly to human vision, reducing the complexity of
convolution from O(K^2) to O(log K) without hurting performance. Built on
this, we propose the Parameter-efficient Large Kernel Network (PeLK). Our PeLK
outperforms modern vision Transformers and ConvNet architectures like Swin,
ConvNeXt, RepLKNet and SLaK on various vision tasks including ImageNet
classification, semantic segmentation on ADE20K and object detection on MS
COCO. For the first time, we successfully scale up the kernel size of CNNs to
an unprecedented 101x101 and demonstrate consistent improvements.
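
The parameter counts behind these claims can be checked with a little arithmetic. The sketch below is only an illustration and not the paper's implementation: `dense_params` and `stripe_params` count per-channel weights for a dense KxK kernel and for the 51x5 + 5x51 stripe form mentioned above, while `shared_axis_bins` is a hypothetical sharing scheme (a small unshared core, then sharing regions whose width doubles toward the periphery) chosen only to show how the number of distinct weight positions along one axis can grow logarithmically with K. The function names and the `core` and `base` parameters are assumptions.

```python
def dense_params(k):
    """Per-channel weights of a dense k x k kernel: grows as k^2."""
    return k * k


def stripe_params(k, s=5):
    """Per-channel weights of a k x s plus an s x k stripe pair."""
    return 2 * k * s


def shared_axis_bins(k, core=5, base=2):
    """Distinct weight positions along one axis under an ASSUMED scheme:
    the central `core` positions keep their own weight, and each side is
    then covered by shared regions of width base, base^2, base^3, ...
    This is an illustrative scheme, not the one defined in the paper."""
    half, core_half = k // 2, core // 2
    remaining = max(half - core_half, 0)  # positions per side outside the core
    bins, covered, width = 0, 0, base
    while covered < remaining:
        covered += width
        width *= base
        bins += 1
    return min(core, k) + 2 * bins


if __name__ == "__main__":
    for k in (13, 31, 51, 101):
        print(k, dense_params(k), stripe_params(k), shared_axis_bins(k))
```

Under these assumptions, doubling the kernel size adds only about one shared region per side, whereas the dense count roughly quadruples.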