An Automated Data-driven Machine Intelligence Framework for Mining Knowledge To Classify Fake News Using NLP

Shikha Mundra, J. Vinod Kumar Reddy, Ankit Mundra, Namita Mittal, Ankit Vidyarthi, Deepak Gupta

ACM Transactions on Asian and Low-Resource Language Information Processing (2023)

Abstract

The rapid spread of fake news over the internet has become a serious concern. In recent years, social media platforms have become widely used for news consumption because they offer low-cost access and rapid dissemination of news. However, the same properties encourage the rapid propagation of 'fake news,' that is, low-quality news containing intentionally misleading content. The quick dissemination of fake news can have devastating consequences for individuals and for society as a whole. To address this problem, this paper proposes an artificial intelligence framework that incorporates ensembles of deep learning features for the classification of fake news. Deep learning approaches, namely Multilayer Perceptron (MLP), Convolutional Neural Networks (CNN), and Bidirectional Long Short-Term Memory (BiLSTM), are used to extract local and sequential features. To obtain relevant features at the word level, these approaches are initialized with pretrained GloVe word embeddings, yielding three base learners: GloVe+MLP, GloVe+CNN, and GloVe+BiLSTM. To extract features at the sentence level, Bidirectional Encoder Representations from Transformers (BERT) is adopted, yielding three more base learners: BERT+MLP, BERT+CNN, and BERT+BiLSTM. In total, six models are employed as base learners. The predictions of the best of these models are then combined using ensembling techniques, and the performance of each ensemble is computed. Overall, we investigate nine ensembling techniques, including weighted voting, bagging, boosting, and stacked ensembles using SVC and logistic regression. Performance is evaluated on four publicly available datasets using the macro-averaged F1-score. We observe that the soft weighted voting-based ensemble outperforms the other models on three datasets, achieving F1-scores of 92.99% (McIntyre), 95.22% (Kaggle), and 78.3% (Gossipcop).
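
As a rough illustration of the soft weighted voting and macro-averaged F1 evaluation mentioned in the abstract, the sketch below averages the class probabilities of several base learners with per-model weights and scores the result. It is a minimal example under stated assumptions, not the authors' implementation: the function names `soft_weighted_vote` and `macro_f1`, the three toy probability tables, and the weights are all hypothetical, and the abstract does not say how the weights were chosen.

```python
from typing import List, Sequence

# Hypothetical helper: probas[m][i][c] is the probability that model m
# assigns to class c for sample i. Weights are illustrative assumptions.
def soft_weighted_vote(probas: Sequence[Sequence[Sequence[float]]],
                       weights: Sequence[float]) -> List[int]:
    total = sum(weights)
    n_samples = len(probas[0])
    n_classes = len(probas[0][0])
    preds = []
    for i in range(n_samples):
        # Weighted average of each class probability across models.
        avg = [
            sum(w * p[i][c] for p, w in zip(probas, weights)) / total
            for c in range(n_classes)
        ]
        # Predict the class with the highest averaged probability.
        preds.append(max(range(n_classes), key=avg.__getitem__))
    return preds


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int],
             n_classes: int = 2) -> float:
    scores = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # Per-class F1; defined as 0.0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    # Macro average: unweighted mean of the per-class F1 scores.
    return sum(scores) / n_classes


if __name__ == "__main__":
    # Toy outputs of three base learners (class 0 = real, class 1 = fake).
    model_a = [[0.9, 0.1], [0.4, 0.6], [0.6, 0.4], [0.2, 0.8]]
    model_b = [[0.6, 0.4], [0.7, 0.3], [0.3, 0.7], [0.4, 0.6]]
    model_c = [[0.8, 0.2], [0.3, 0.7], [0.45, 0.55], [0.1, 0.9]]
    y_true = [0, 1, 0, 1]

    y_pred = soft_weighted_vote([model_a, model_b, model_c], [0.5, 0.2, 0.3])
    print(y_pred, round(macro_f1(y_true, y_pred), 4))
```

In practice the weights might be derived from each base learner's validation score, but that choice is an assumption here rather than something the abstract states.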

Keywords

classify fake news, mining knowledge, NLP, machine intelligence framework, data-driven
