Indexing Text Related To Software Vulnerabilities In Noisy Communities Through Topic Modelling

2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), 2018

Abstract

Despite efforts in the security community to quickly index and disseminate vulnerabilities as they are discovered and addressed, there are concerns about how to scale up the knowledge management of vulnerabilities given their dramatic growth rate. To address these concerns, recent research has shifted towards more proactive approaches, in particular leveraging text mining methods to improve vulnerability identification and dissemination to security investigators. While they provide a starting point for understanding vulnerability trends, recent methods still rely on curated identifiers, such as 'CVE-*', and hence miss the majority of cybersecurity activity. We show that overlapping textual themes in software vulnerabilities can be leveraged to identify related software vulnerability discussions without prior knowledge of identifiers. Our method obtained 86% accuracy in identifying related vulnerabilities with minimal pre-processing in a noisy community.

Keywords

cybersecurity, topic-modeling, social-media
