Investigating Hostile Post Detection in Hindi

Varad Bhatnagar, Prince Kumar, Pushpak Bhattacharyya

Neurocomputing (2022)

Abstract

Hostile content on social media platforms is a growing problem for governments and organizations. There is a need for AI-based intervention that can filter hostile content at scale. The challenge lies in the ambiguity of language, the absence of training data, and local context. In this paper, we investigate Hostile Post Detection for Hindi, the most widely spoken language in the Indian subcontinent and the third most widely spoken in the world. We extend our prior work in this area along three dimensions: (i) representations, (ii) data, and (iii) architecture, exploring approaches such as Transformers and Multi-Task Learning, among others. In this highly experimental study, we draw comparisons, identify trends, and present insights. We improve on the baseline by 16.5% and 29.77% on the two evaluation metrics, Coarse-Grained F1 Score and Fine-Grained F1 Score, respectively. We also beat our prior results by 0.93% and 9.18% on these two metrics, respectively. To the best of our knowledge, the 60 experiments we perform exceed the number reported in any other work on hostility detection in Hindi.

Keywords

Hostile Post, Label Powerset, Binary Relevance, BERT, MTDNN
