Attention-based multi-modal fusion sarcasm detection

Jing Liu, Shengwei Tian, Long Yu, Jun Long, Tiejun Zhou, Bo Wang

Journal of Intelligent & Fuzzy Systems (2023)

Abstract

Sarcasm is a form of expression in which the intended meaning is often the opposite of the apparent meaning. Previous work on sarcasm detection focused mainly on text, but most information today is multi-modal, combining text and images. Multi-modal sarcasm detection has therefore become an increasingly active research topic. To better capture the true meaning of multi-modal sarcastic content, this paper proposes a multi-modal fusion sarcasm detection model based on the attention mechanism. The model uses a Vision Transformer (ViT) to extract image features and a purpose-designed Double-Layer Bi-Directional Gated Recurrent Unit (D-BiGRU) to extract text features. The features of the two modalities are fused into a single feature vector, which is enhanced by attention and then used for prediction. On the baseline datasets, the model outperforms the best baseline model by 0.71% in F1-score and 0.38% in accuracy.
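The abstract does not give the exact fusion formulation, so the following is only a minimal sketch of one possible reading of "fused into one feature vector and predicted after attention enhancement". It assumes the text and image features have already been extracted (for example by a D-BiGRU and a ViT) and projected to the same dimension. The function names `attention_fuse` and `predict_sarcasm`, the learned `query` vector, and the logistic classifier are all hypothetical and are not taken from the paper.

```python
import math


def dot(a, b):
    """Dot product of two equal-length lists of floats."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(text_feat, image_feat, query):
    """Hypothetical attention-weighted fusion of two modality vectors.

    Each modality is scored against an (assumed) learned query vector with
    scaled dot-product attention; the features are re-weighted by the
    resulting attention weights and concatenated into one fused vector.
    """
    if not (len(text_feat) == len(image_feat) == len(query)):
        raise ValueError("text_feat, image_feat and query must share a dimension")
    scale = math.sqrt(len(query))
    weights = softmax([
        dot(query, text_feat) / scale,
        dot(query, image_feat) / scale,
    ])
    fused = [weights[0] * x for x in text_feat] + [weights[1] * x for x in image_feat]
    return fused, weights


def predict_sarcasm(fused, classifier_weights, bias=0.0):
    """Assumed logistic classifier: returns P(sarcastic) for a fused vector."""
    logit = dot(fused, classifier_weights) + bias
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    # Toy 2-dimensional features standing in for D-BiGRU and ViT outputs.
    text_feat = [0.8, -0.2]
    image_feat = [0.1, 0.9]
    query = [0.5, 0.5]  # would be learned in a real model
    fused, weights = attention_fuse(text_feat, image_feat, query)
    print("attention weights:", weights)
    print("P(sarcastic):", predict_sarcasm(fused, [0.3, -0.1, 0.2, 0.4]))
```

In a trained model the query and classifier weights would be learned jointly with the feature extractors; here they are fixed toy values chosen only to make the data flow visible.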

Keywords

Multi-modal, sarcasm detection, Attention, ViT, D-BiGRU
