Annual Computer Security Applications Conference (ACSAC) 2022

Practical Binary Code Similarity Detection with BERT-based Transferable Similarity Learning

Binary code similarity detection serves as a basis for a wide spectrum of applications, including software plagiarism detection, malware classification, and known vulnerability discovery. However, inferring the contextual meaning of a binary is challenging because the semantic information available in source code is absent. Recent advances leverage deep learning architectures to better understand underlying code semantics and the Siamese architecture to better detect code similarity. In this paper, we propose BinShot, a BERT-based similarity learning architecture that is highly transferable for effective binary code similarity detection. We tackle the problem of detecting code similarity with one-shot learning (a special case of few-shot learning). To this end, we adopt a weighted distance vector with binary cross entropy as the loss function on top of BERT. With a prototype implementation of BinShot, our experimental results demonstrate its effectiveness, transferability, and practicality; it is robust in detecting the similarity of previously unseen functions. We show that BinShot outperforms previous state-of-the-art approaches for binary code similarity detection.
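
The similarity head mentioned in the abstract (a weighted distance vector trained with binary cross entropy) can be illustrated with a minimal sketch. This is not the BinShot implementation: it assumes the distance vector is the element-wise absolute difference of two function embeddings, that a single learned weight vector and bias map it to a sigmoid score, and it uses toy lists in place of BERT outputs. The names `weighted_distance_score` and `binary_cross_entropy` are hypothetical.

```python
import math


def weighted_distance_score(u, v, weights, bias=0.0):
    """Similarity in (0, 1): sigmoid of a weighted sum over |u_i - v_i|.

    Assumption: u and v stand in for embeddings of two binary functions
    produced by the same (Siamese) encoder; weights and bias would be learned.
    """
    if not (len(u) == len(v) == len(weights)):
        raise ValueError("u, v and weights must have the same length")
    z = bias + sum(w * abs(a - b) for w, a, b in zip(weights, u, v))
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(p, label, eps=1e-12):
    """BCE for one pair; label is 1 for a similar pair, 0 for a dissimilar one."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


# Toy embeddings (illustrative values only, not real model outputs).
anchor = [0.2, 0.9, -0.4]
same = [0.2, 0.9, -0.4]
different = [-0.8, 0.1, 0.7]

# Negative weights make a larger distance lower the similarity score.
weights = [-2.0, -2.0, -2.0]
bias = 3.0

score_same = weighted_distance_score(anchor, same, weights, bias)
score_diff = weighted_distance_score(anchor, different, weights, bias)
loss_same = binary_cross_entropy(score_same, 1)
loss_diff = binary_cross_entropy(score_diff, 0)
```

With these toy values the identical pair scores above 0.5 and the distant pair below it, so both pairs incur a small loss under their correct labels. In training, the weights and bias would be updated to minimize this loss over many labeled pairs.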

Sunwoo Ahn
Seoul National University

Seonggwan Ahn
Seoul National University

Hyungjoon Koo
Sungkyunkwan University

Yunheung Paek
Seoul National University


 


