WeNet: Production First and Production Ready End-to-End Speech Recognition Toolkit

Binbin Zhang, Di Wu, Chao Yang, Xiaoyu Chen, Zhendong Peng, Xiangming Wang, Zhuoyuan Yao, Xiong Wang, Fan Yu, Lei Xie, Xin Lei

arXiv (2021)


Abstract

In this paper, we present WeNet, a new open-source, production-first and production-ready end-to-end (E2E) speech recognition toolkit. The main motivation of WeNet is to close the gap between research and production of E2E speech recognition models. WeNet provides an efficient way to ship automatic speech recognition (ASR) applications in several real-world scenarios, which is its main difference from, and advantage over, other open-source E2E speech recognition toolkits. This paper introduces WeNet from three aspects: model architecture, framework design, and performance metrics. Our experiments on AISHELL-1 using WeNet not only give a promising character error rate (CER) on a unified streaming and non-streaming two-pass (U2) E2E model but also show reasonable real-time factor (RTF) and latency; both of these aspects are favored for production adoption. The toolkit is publicly available at https://github.com/mobvoi/wenet.


Keywords

speech, WeNet, recognition, end-to-end
