
A Study of the Tibetan Linguistic Picture of the World Using Computer Ontology

REVUE D'ETUDES TIBETAINES (2020)

Abstract
The research presented in this article is a summary of several research projects aimed at the creation of a full-scale natural language processing engine based on a consistent formal model of Tibetan vocabulary, grammar, and semantics, verified by and developed on the basis of a representative, hand-tested corpus of texts. The Basic Corpus of Classical Tibetan and the Corpus of Indigenous Tibetan Grammar Treatises comprise 34,000 and 48,000 tokens, respectively. Tibetan texts are represented both in the Tibetan Unicode script and in standard Wylie romanization. These corpora are developed, annotated, and tested manually by Tibetologists, and in this sense are unique. The ultimate goal of our project is to create a formal model (a grammar and a linguistic ontology) of the Tibetan language, including morphosyntax, syntax of phrases, hyperphrase unities, and semantics, that can produce a correct morpho-syntactic, syntactic, and semantic annotation of the corpora without any further manual corrections. This study is based on the technologies and tools of the AIIRE project. AIIRE is a free open-source natural language understanding