Impact of Transformers on Multilingual Fake News Detection for Tamil and Malayalam

Speech and Language Technologies for Low-Resource Languages (2023)

Abstract
With the ready availability of technology stacks for implementing state-of-the-art neural networks, fake news (fake information) classification has attracted many researchers working in natural language processing, machine learning, and deep learning. Most existing work on fake news detection targets English, which limits its usability outside the English-speaking population. For multilingual content, fake news classification in low-resource languages is challenging because sufficiently large annotated corpora are unavailable. In this work, we study and analyze the impact of transformer-based models such as multilingual BERT (M-BERT), XLM-RoBERTa, and MuRIL on a dataset created (by translation) as part of this research on multilingual low-resource fake news classification. We conduct various experiments, including language-specific runs and comparisons across models, to assess their impact. We also release a multilingual dataset in Tamil and Malayalam, drawn from multiple domains, which could be useful for further research in this direction. The datasets and code are available on GitHub (https://github.com/hariharanrl/Multilingual_Fake_News).
Keywords
Fake News, XLM-RoBERTa, M-BERT, Low-Resource