A Unified Computational Lexicon for Hindi-English Code-Switching

msra(2004)

Abstract
We investigate how the lexicons of languages in contact are merged to generate a fused lexicon for the code-mixed variety. Using the HPSG formalism, we develop computational lexicons for Hindi and English and explore how they can be merged to obtain a fused-lect lexicon. We consider the Hindi-English Code-Switching variety (HECS), a stable variety that has resulted from contact between these languages. HECS uses words and larger phrasal constituents from one language with the syntax of the other, with Hindi predominantly serving as the matrix language. The grammar developed here captures this mixing in a unified lexicon that combines pure English entries, pure Hindi entries, and cross-referenced lexical structures based on synset information for the entries. We propose the construct of a hinge word to capture the cross-linguistic linkages, which preserve the HPSG-based head-subcategory schema of the source lexicons. The claim is that code-switching structures in a bilingual repertoire are triggered by cross-linguistic lexical representations that unify the matrix and embedded lexicons, and that computational mechanisms for handling this mixing can be constructed using the same principles.
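As a rough illustration of the unified lexicon described in the abstract, the sketch below merges a toy Hindi lexicon and a toy English lexicon and cross-references entries that share a synset. It is a minimal sketch under stated assumptions, not the paper's HPSG grammar: the `LexEntry` fields, the synset labels, the romanized Hindi forms, and the rule that a hinge links entries with identical head and subcategorization are all hypothetical simplifications introduced here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LexEntry:
    """Toy lexical entry; every field is an illustrative assumption."""
    form: str      # surface form (Hindi forms are romanized)
    lang: str      # "hi" or "en"
    head: str      # category of the head, e.g. "noun" or "verb"
    subcat: tuple  # categories the head subcategorizes for
    synset: str    # hypothetical shared sense label


def build_unified_lexicon(hindi, english):
    """Merge both lexicons and collect cross-language hinge pairs.

    A pair counts as a hinge here only if both entries share a synset
    and have the same head and subcat frame (an assumed simplification).
    """
    entries = list(hindi) + list(english)
    by_synset = defaultdict(list)
    for entry in entries:
        by_synset[entry.synset].append(entry)

    hinges = []
    for group in by_synset.values():
        for hi in (e for e in group if e.lang == "hi"):
            for en in (e for e in group if e.lang == "en"):
                if hi.head == en.head and hi.subcat == en.subcat:
                    hinges.append((hi, en))
    return entries, hinges


def counterparts(hinges, form, target_lang):
    """Return forms in target_lang linked to `form` through a hinge."""
    found = []
    for hi, en in hinges:
        if target_lang == "hi" and en.form == form:
            found.append(hi.form)
        elif target_lang == "en" and hi.form == form:
            found.append(en.form)
    return found


HINDI = [
    LexEntry("kitaab", "hi", "noun", (), "book.n"),
    LexEntry("padhna", "hi", "verb", ("NP", "NP"), "read.v"),
]
ENGLISH = [
    LexEntry("book", "en", "noun", (), "book.n"),
    LexEntry("read", "en", "verb", ("NP", "NP"), "read.v"),
    LexEntry("the", "en", "det", (), "definite.d"),  # no Hindi counterpart
]

ENTRIES, HINGES = build_unified_lexicon(HINDI, ENGLISH)
```

Calling `counterparts(HINGES, "book", "hi")` looks up the Hindi entries linked to the English noun, while an entry with no cross-reference, such as the English determiner, stays in the merged lexicon as a pure single-language entry.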