Practical Binary Code Similarity Detection with BERT-based Transferable Similarity Learning.

ACSAC(2022)

引用 6|浏览31
暂无评分
摘要
Binary code similarity detection (BCSD) serves as a basis for a wide spectrum of applications, including software plagiarism, malware classification, and known vulnerability discovery. However, the inference of contextual meanings of a binary is challenging due to the absence of semantic information available in source codes. Recent advances leverage the benefits of a deep learning architecture into a better understanding of underlying code semantics and the advantages of the Siamese architecture into better BCSD. In this paper, we propose BinShot, a BERT-based similarity learning architecture that is highly transferable for effective BCSD. We tackle the problem of detecting code similarity with one-shot learning (a special case of few-shot learning). To this end, we adopt a weighted distance vector with a binary cross entropy as a loss function on top of BERT. With the prototype of BinShot, our experimental results demonstrate the effectiveness, transferability, and practicality of BinShot, which is robust to detecting the similarity of previously unseen functions. We show that BinShot outperforms the previous state-of-the-art approaches for BCSD.
更多
查看译文
关键词
Binary Analysis, Similarity Detection, Deep Neural Network
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要