Pipelining Semantic Expansion and Noise Filtering for Sentiment Analysis of Short Documents – CluSent Method

Felipe Viegas, Sérgio D. Canuto, Washington Cunha, Celso França, Cláudio M. V. de Andrade, Guilherme Fonseca, Ana Machado, Leonardo Rocha, Marcos André Gonçalves

Journal on Interactive Systems (2024)

Abstract

Constructing effective sentiment models is harder when there is too little information to learn from, as is typical of short texts. Enriching short texts with semantic relationships is crucial for capturing affective nuances and improving model efficacy, but it can also introduce noise. This article introduces CluSent, a novel approach designed for customized, dataset-oriented sentiment analysis. CluSent builds on the CluWords concept, a powerful representation of semantically related words. To address information scarcity and noise, CluSent (i) leverages the semantic neighborhood of pre-trained word embedding representations to enrich the document representation and (ii) introduces dataset-specific filtering and weighting mechanisms to manage noise. These mechanisms use part-of-speech and polarity/intensity information from lexicons. In an extensive experimental evaluation spanning 19 datasets and five state-of-the-art baselines, including modern transformer architectures, CluSent was the best method in the majority of scenarios (28 out of 38 possibilities), with performance gains of up to 14% over the strongest baselines.
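
The two steps named in the abstract, semantic expansion followed by noise filtering, can be illustrated with a minimal sketch. This is not the authors' implementation: the two-dimensional vectors, the lexicon scores, the function names `expand` and `filter_by_lexicon`, and both thresholds are hypothetical values chosen for illustration, and the part-of-speech filtering mentioned in the abstract is omitted.

```python
import math

# Toy 2-d vectors standing in for pre-trained word embeddings (hypothetical values).
EMBEDDINGS = {
    "good": (1.0, 0.1),
    "great": (0.9, 0.2),
    "fine": (0.8, 0.5),
    "bad": (-1.0, 0.1),
    "movie": (0.1, 1.0),
    "film": (0.2, 0.9),
}

# Toy polarity/intensity lexicon (hypothetical scores in [-1, 1]).
LEXICON = {"good": 0.7, "great": 0.9, "fine": 0.3, "bad": -0.8}


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def expand(tokens, threshold=0.95):
    """Semantic expansion: add every vocabulary word whose cosine
    similarity to a document token is at least `threshold`.
    Returns {word: highest similarity to any document token}."""
    weights = {}
    for tok in tokens:
        if tok not in EMBEDDINGS:
            continue  # out-of-vocabulary tokens cannot be expanded
        for cand, vec in EMBEDDINGS.items():
            sim = cosine(EMBEDDINGS[tok], vec)
            if sim >= threshold:
                weights[cand] = max(weights.get(cand, 0.0), sim)
    return weights


def filter_by_lexicon(weights, min_intensity=0.5):
    """Noise filtering: keep only words whose absolute lexicon intensity
    is at least `min_intensity`, and scale their weight by that intensity."""
    return {
        word: sim * abs(LEXICON[word])
        for word, sim in weights.items()
        if abs(LEXICON.get(word, 0.0)) >= min_intensity
    }


expanded = expand(["good", "movie"])
filtered = filter_by_lexicon(expanded)
print(sorted(expanded))
print(sorted(filtered))
```

In this toy setup, expanding the document "good movie" pulls in the near neighbors "great" and "film", and the lexicon filter then keeps only the sentiment-bearing terms. A real pipeline would swap in actual pre-trained embeddings and a published sentiment lexicon.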

Keywords

Sentiment Analysis, Classification, Natural Language Processing
