AST2Vec: A Robust Neural Code Representation for Malicious PowerShell Detection

SCIENCE OF CYBER SECURITY, SCISEC 2023(2023)

引用 0|浏览2
暂无评分
摘要
In recent years, PowerShell has become a commonly used carrier to wage cyber attacks. As a script, PowerShell is easy to obfuscate to evade detection. Thus, they are difficult to detect directly using traditional anti-virus software. Existing advanced detection methods generally recover obfuscated scripts before detection. However, most deobfuscation tools can not achieve precise recovery on obfuscated scripts due to emerging obfuscation techniques. To solve the problem, we propose a robust neural code representation method, namely AST2Vec, to detect malicious PowerShell without de-obfuscating scripts. 6 Abstract Syntax Tree (AST) recovery-related statement nodes are defined to identify obfuscated subtrees. Then AST2Vec splits the large AST of entire PowerShell scripts into a set of small subtrees rooted by these 6 types of nodes and performs tree-based neural embeddings on all extracted subtrees by capturing lexical and syntactical knowledge of statement nodes. Based on the sequence of statement vectors, a bidirectional recursive neural network (Bi-RNN) is modeled to leverage the context of statements and finally produce vector representation of scripts. We evaluate the proposed method for malicious PowerShell detection through extensive experiments. Experimental results indicate that our model outperforms the state-of-the-art approaches.
更多
查看译文
关键词
PowerShell scripts,malware detection,abstract syntax trees,representation learning,Bi-RNN
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要