From Word Embedding to Cyber-Phrase Embedding: Comparison of Processing Cybersecurity Texts

Moumita Das Purba, Bill Chu, Ehab Al-Shaer

2020 IEEE International Conference on Intelligence and Security Informatics (ISI) (2020)

Abstract
Much of the vital information about emerging threats and the corresponding defensive measures is contained in large volumes of natural language text online. Capturing such actionable intelligence in real time is critical to automatically preventing large-scale attacks. The ATT&CK framework is a widely recognized standard for cataloging technical details of cyber threats and deploying mitigating measures. A technique in ATT&CK specifies a set of adversary actions to achieve a particular goal, such as Exfiltration over Command and Control channel. Details of the technique include encrypted traffic and encoded data. A key challenge in identifying such cyber intelligence from natural language texts is that a given action, such as encrypted traffic, can be expressed in many alternative ways (e.g., send using a self-signed certificate, send using HTTPS requests). It is not practical to manually provide an exhaustive list of all such variants. We demonstrate that using cyber-phrase embedding on a cybersecurity text corpus is a promising approach to overcoming such difficulties. Our evaluation demonstrates that our model outperforms existing models. We have created an open-source project to make our tools and data available to the cybersecurity research community.
Keywords
Cyber threat intelligence, Word Embedding, Text mining, NLP, Cyber attack, MITRE ATT&CK Framework