
ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network

Computing Research Repository (CoRR), 2020

South China University of Technology | University of Adelaide | Huawei Noah's Ark Lab

Abstract
Scene text detection and recognition has received increasing research attention. Existing methods can be roughly categorized into two groups: character-based and segmentation-based. These methods either are costly for character annotation or need to maintain a complex pipeline, which is often not suitable for real-time applications. Here we address the problem by proposing the Adaptive Bezier-Curve Network (ABCNet). Our contributions are three-fold: 1) For the first time, we adaptively fit oriented or curved text by a parameterized Bezier curve. 2) We design a novel BezierAlign layer for extracting accurate convolution features of a text instance with arbitrary shapes, significantly improving the precision compared with previous methods. 3) Compared with standard bounding box detection, our Bezier curve detection introduces negligible computation overhead, resulting in superiority of our method in both efficiency and accuracy. Experiments on oriented or curved benchmark datasets, namely Total-Text and CTW1500, demonstrate that ABCNet achieves state-of-the-art accuracy, meanwhile significantly improving the speed. In particular, on Total-Text, our real-time version is over 10 times faster than recent state-of-the-art methods with a competitive recognition accuracy. Code is available at https://git.io/AdelaiDet.
Key words
ABCNet, scene text spotting, adaptive Bezier-curve network, text instance, character-based group, segmentation-based group, convolution features extraction


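Key points: The paper proposes the Adaptive Bezier-Curve Network (ABCNet) for real-time scene text detection and recognition. It fits oriented or curved text with parameterized Bezier curves, improving both efficiency and accuracy.

Methods: The authors design a novel BezierAlign layer that extracts accurate convolution features for text instances of arbitrary shape, and they fit text boundaries with parameterized Bezier curves, which simplifies the detection pipeline.

Experiments: Experiments on the Total-Text and CTW1500 datasets show that ABCNet achieves state-of-the-art accuracy. On Total-Text, the real-time version is more than 10 times faster than recent state-of-the-art methods while keeping competitive recognition accuracy.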
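
The idea of fitting a text boundary with a parameterized Bezier curve can be illustrated with a small sketch. This is not the authors' code: the cubic degree, the chord-length parameterization, the least-squares fit, and the function names `bernstein_row`, `bezier_point`, `solve_linear`, and `fit_bezier` are all assumptions made for illustration, since the abstract states only that the curve is parameterized.

```python
from math import comb, dist


def bernstein_row(t, degree=3):
    """Bernstein basis values B_{i,degree}(t) for i = 0..degree."""
    return [comb(degree, i) * t**i * (1 - t) ** (degree - i)
            for i in range(degree + 1)]


def bezier_point(control_points, t):
    """Evaluate a 2D Bezier curve at parameter t in [0, 1]."""
    weights = bernstein_row(t, len(control_points) - 1)
    return tuple(sum(w * p[k] for w, p in zip(weights, control_points))
                 for k in range(2))


def solve_linear(a, b):
    """Solve a @ x = b (b may have several columns) by Gauss-Jordan
    elimination with partial pivoting."""
    n = len(a)
    m = [list(row_a) + list(row_b) for row_a, row_b in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [[m[i][n + k] / m[i][i] for k in range(len(b[0]))]
            for i in range(n)]


def fit_bezier(points, degree=3):
    """Least-squares fit of Bezier control points to ordered 2D points.

    Assumption: each point is assigned a parameter t proportional to its
    cumulative chord length along the polyline (points must not all coincide).
    """
    lengths = [0.0]
    for p, q in zip(points, points[1:]):
        lengths.append(lengths[-1] + dist(p, q))
    ts = [s / lengths[-1] for s in lengths]
    rows = [bernstein_row(t, degree) for t in ts]
    n = degree + 1
    # Normal equations: (B^T B) C = B^T P
    btb = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
           for i in range(n)]
    btp = [[sum(r[i] * p[k] for r, p in zip(rows, points)) for k in range(2)]
           for i in range(n)]
    return [tuple(c) for c in solve_linear(btb, btp)]
```

For example, `bezier_point([(0, 0), (1, 2), (3, 2), (4, 0)], 0.5)` evaluates to `(2.0, 1.5)`. Under these assumptions, one side of a curved text region is described by four control points (eight numbers) rather than a dense polygon, which is consistent with the abstract's claim that the curve representation adds little overhead to standard box detection.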