The Chinese-English Bilingual Sentence Alignment Based on Length.

Asian Language Processing (2011)

Abstract
Bilingual sentence pairs are a key resource for statistical machine translation. Currently, most sentence-aligned corpora pair English with French or English with German, and there are few specialized sentence-aligned datasets for English and Chinese. Our aim is therefore to create large-scale, high-precision English-Chinese aligned sentences. A length-based method is used to align bilingual paragraphs extracted from CNKI (China National Knowledge Infrastructure), one of the largest academic websites, which contains a huge number of Chinese-English bilingual paragraphs. Our method adapts and combines existing word-based and hybrid approaches. Finally, we choose the best alignment by dynamic programming. Experiments on the CNKI dataset show that the presented method achieves satisfactory recall and precision.
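
To make the idea of length-based alignment with dynamic programming concrete, here is a minimal sketch in Python. It is not the paper's implementation: the bead types, the penalty values, the `length_cost` function, and the `ratio` parameter (expected target characters per source character) are all assumptions chosen for illustration, loosely in the spirit of classic length-based aligners.

```python
from typing import List, Tuple

# Hypothetical bead types (source sentences, target sentences) with
# illustrative penalties; these values are assumptions, not from the paper.
BEADS = {(1, 1): 0.0, (1, 0): 4.0, (0, 1): 4.0, (2, 1): 1.0, (1, 2): 1.0}


def length_cost(src_len: int, tgt_len: int, ratio: float) -> float:
    """Relative mismatch between the expected and observed target length."""
    expected = src_len * ratio
    return abs(tgt_len - expected) / max(expected, tgt_len, 1.0)


def align(src: List[str], tgt: List[str],
          ratio: float = 1.0) -> List[Tuple[List[int], List[int]]]:
    """Return the cheapest sequence of beads as (source indices, target indices)."""
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    # Forward pass: extend every reachable state by each bead type.
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for (di, dj), penalty in BEADS.items():
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                s_len = sum(len(s) for s in src[i:ni])
                t_len = sum(len(t) for t in tgt[j:nj])
                c = cost[i][j] + penalty + length_cost(s_len, t_len, ratio)
                if c < cost[ni][nj]:
                    cost[ni][nj] = c
                    back[ni][nj] = (di, dj)

    # Backward pass: recover the beads on the cheapest path.
    beads = []
    i, j = n, m
    while (i, j) != (0, 0):
        di, dj = back[i][j]
        beads.append((list(range(i - di, i)), list(range(j - dj, j))))
        i, j = i - di, j - dj
    beads.reverse()
    return beads
```

Calling `align(chinese_sentences, english_sentences, ratio=r)` on the sentences of one paragraph pair returns index groups such as one source sentence matched to two target sentences. A suitable `ratio` for Chinese-English text would need to be estimated from data; the default of 1.0 is only a placeholder.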
Keywords
huge Chinese-English bilingual paragraph, sentence alignment corpus, precision ratio, method adapts, bilingual sentence pair, bilingual paragraph, specialized sentence alignment dataset, Chinese-English bilingual sentence alignment, CNKI dataset, recall ratio, best alignment, language translation, natural language processing, dynamic programming