Bots and Gender Profiling using Masking Techniques: Notebook for PAN at CLEF 2019

Victor Jimenez-Villar, Javier Sánchez-Junquera, Manuel Montes-y-Gómez, Luis Villaseñor-Pineda, Simone Paolo Ponzetto

Semantic Scholar (2019)

Abstract
This work describes our proposed solution for the author profiling shared task at PAN 2019. The task consists of identifying whether the author of a Twitter feed is a bot or a human and, if human, determining whether the author is male or female. As in previous years, the task covers several languages, in this case English and Spanish. Our proposal focuses on the preprocessing and feature extraction steps: we mainly apply masking techniques that emphasize the relevant terms by obfuscating the irrelevant ones while keeping information about the structure of the texts. Using this approach, we obtained accuracies of 0.92 and 0.81 on the Spanish test set for classifying bots/humans and males/females, respectively; similarly, we obtained accuracies of 0.91 and 0.82 on the English dataset.
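
The abstract does not say how relevant terms are selected or which masking scheme is used, so the following is only a minimal sketch of the general idea. It assumes a hypothetical, pre-selected set of relevant terms (`relevant_terms`) and a hypothetical helper `mask_text` that replaces every character of any other word with `*`, leaving spacing, punctuation, and word lengths intact so that the structure of the text is preserved.

```python
import re


def mask_text(text, relevant_terms, mask_char="*"):
    """Mask every word not in relevant_terms, preserving text structure.

    Assumption: relevant_terms is a hand-picked or precomputed set; the
    abstract does not state how the authors choose it.
    """
    relevant = {term.lower() for term in relevant_terms}

    def mask_word(match):
        word = match.group(0)
        # Keep relevant words verbatim; obfuscate the rest character by
        # character so the original word length stays visible.
        if word.lower() in relevant:
            return word
        return mask_char * len(word)

    # \w+ matches runs of letters, digits, and underscores (Unicode-aware),
    # so whitespace and punctuation pass through unchanged.
    return re.sub(r"\w+", mask_word, text)


masked = mask_text("I love my new phone!", {"love", "my"})
print(masked)  # * love my *** *****!
```

The masked text could then be passed to an ordinary feature extractor (for example, character or word n-grams) ahead of a classifier; that downstream pipeline is likewise an assumption here, not something the abstract specifies.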