Supervised sentiment analysis in multilingual environments.

David Vilares,Miguel A. Alonso,Carlos Gómez-Rodríguez

Inf. Process. Manage.（2017）

引用 88|浏览73

暂无评分

摘要

This article tackles the problem of performing multilingual polarity classification on Twitter, comparing three techniques: (1) a multilingual model trained on a multilingual dataset, obtained by fusing existing monolingual resources, that does not need any language recognition step, (2) a dual monolingual model with perfect language detection on monolingual texts and (3) a monolingual model that acts based on the decision provided by a language identification tool. The techniques were evaluated on monolingual, synthetic multilingual and code-switching corpora of English and Spanish tweets. In the latter case we introduce the first code-switching Twitter corpus with sentiment labels. The samples are labelled according to two well-known criteria used for this purpose: the SentiStrength scale and a trinary scale (positive, neutral and negative categories). The experimental results show the robustness of the multilingual approach (1) and also that it outperforms the monolingual models on some monolingual datasets.

查看译文

关键词

Sentiment analysis,Multilingual,Code-Switching

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要