Building Comparable Corpora

Synthesis Lectures on Human Language Technologies (2023)

Abstract
In a parallel corpus, we know by design which document is a translation of which. When the link between documents in different languages is not known, it has to be established. This chapter discusses methods for measuring document similarity across languages and for evaluating the results. It then turns to methods for building comparable corpora with different degrees of comparability and for different tasks.