Binary Vulnerability Similarity Detection Based on Function Parameter Dependency

Bing Xia, Wenbo Liu, Qudong He, Fudong Liu, Jianmin Pang, RuiNan Yang, JiaBin Yin, YunXiang Ge
Copyright: © 2023 |Pages: 16
DOI: 10.4018/IJSWIS.322392

Abstract

Many existing approaches compute binary vulnerability similarity at the level of the whole binary procedure. This coarse detection granularity cannot accurately locate the position where a vulnerability is triggered, and it leads to a higher false positive rate. This paper therefore proposes a new binary vulnerability similarity detection method based on function parameter dependency in hazard APIs. First, the instructions of different architectures are converted into an intermediate language, and a compiler with a back-end optimizer is used to optimize and normalize the binary procedure. Then, the hazard APIs that appear in the binary procedure are located, and function parameter dependency analysis is performed to generate a set of parameter slices for each hazard API. Experiments show that the method achieves a higher recall rate (up to 14.3% better than the baseline model) in real-world scenarios, and that it not only locates the triggering position of a vulnerability but also identifies vulnerabilities that have already been fixed.

Introduction

Open-source code accelerates the development of software systems, but once a vulnerability is found in widely used open-source code, it may cause large-scale network security problems. According to the “2022 Open Source Security and Risk Analysis Report” provided by Synopsys, as of 2022, 97% of code bases contain open-source components, and 81% of code bases contain at least one vulnerability. Attacks that exploit unpatched cloned code can have a serious impact on software systems (Yang., 2019). Because of the need to protect trade secrets or intellectual property rights, software is usually deployed in binary form. Detecting open-source vulnerabilities in binary files is therefore critical to the security of the software supply chain.

At present, binary code similarity techniques that operate on whole binary procedures are widely used in binary vulnerability detection. The main idea is to compare the semantic information of the target binary code with that of binary code known to carry a vulnerability (Xia et al., 2022). However, binary vulnerabilities are typically caused by a call to a hazard API whose parameters are incompletely constrained, so a vulnerability is triggered by only part of a procedure's code rather than by the whole procedure. Existing detection based on procedure-level code similarity therefore has a coarse detection granularity, which leads to a higher false positive rate in real-world application scenarios. We therefore propose a binary vulnerability similarity detection method based on the hazard API function union, which is composed of the hazard API name and the API parameter slices. The method detects vulnerabilities at a finer granularity and effectively reduces the false alarm rate.
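To make the idea of a function union concrete, the following is a minimal sketch and not the paper's implementation. It assumes a toy, straight-line intermediate representation with no control flow; the `Instr` type, the `HAZARD_APIS` set, and the helper names are hypothetical and chosen only for illustration. For each call to a hazard API, it walks backward from the call site and collects the instructions that each parameter depends on.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Illustrative set of hazard APIs; the paper's actual list is not given here.
HAZARD_APIS = {"memcpy", "strcpy", "sprintf"}


@dataclass(frozen=True)
class Instr:
    """One instruction of a hypothetical straight-line IR."""
    dest: str               # variable defined by the instruction ("" if none)
    op: str                 # opcode, or "call" for a function call
    args: Tuple[str, ...]   # operand variables or numeric constants
    callee: str = ""        # called function name when op == "call"


def backward_slice(code: List[Instr], call_idx: int, var: str) -> List[int]:
    """Indices of instructions before call_idx that `var` depends on."""
    wanted = {var}
    picked = []
    for i in range(call_idx - 1, -1, -1):
        ins = code[i]
        if ins.dest and ins.dest in wanted:
            picked.append(i)
            # The nearest definition satisfies the dependency on ins.dest;
            # its non-constant operands become new dependencies.
            wanted.discard(ins.dest)
            wanted.update(a for a in ins.args if not a.isdigit())
    return sorted(picked)


def function_unions(code: List[Instr]):
    """Pair each hazard API call with one backward slice per parameter."""
    unions = []
    for idx, ins in enumerate(code):
        if ins.op == "call" and ins.callee in HAZARD_APIS:
            slices = tuple(tuple(backward_slice(code, idx, a)) for a in ins.args)
            unions.append((ins.callee, slices))
    return unions


# Hypothetical procedure: only instructions 0, 1 and 2 feed the memcpy call.
example = [
    Instr("buf", "alloca", ("64",)),
    Instr("n", "load", ("len_ptr",)),
    Instr("t", "add", ("n", "1")),
    Instr("x", "mul", ("y", "2")),   # unrelated to the hazard API call
    Instr("", "call", ("buf", "src", "t"), "memcpy"),
]
unions = function_unions(example)
print(unions)
```

In this sketch the unrelated `mul` instruction is left out of every slice, which is the intuition behind comparing only the code that feeds a hazard API rather than the whole procedure. A real analysis would also have to handle branches, memory aliasing, and calls between procedures.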

The contributions of this paper are as follows:

  • 1. A novel vulnerability representation for binary procedures. The representation is based on the hazard API name and the API parameter slices, and it locates the triggering position of a vulnerability at a fine granularity.

  • 2. Support for multiple instruction architectures, such as X86, ARM, and MIPS. Binary procedures from different architectures are lifted to an intermediate language, and a back-end optimizer is used to optimize and normalize them, which minimizes the semantic gap between architectures.

  • 3. ComFU, a complete binary code similarity vulnerability detection system. Experimental results show that it provides better accuracy than previous state-of-the-art systems and can identify whether a vulnerability has been patched.

Background

Binary code similarity technology emerged as a way to quickly discover known vulnerabilities in binary code. The main approach is to use a neural network to obtain the semantic information of the vulnerable binary and of the target, and then to compare whether the two are similar (Xia et al., 2022). According to what is compared, existing binary code similarity methods can be divided into two types: those based on procedure matching and those based on patching.
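The comparison step described above can be sketched as follows. This is an illustrative assumption, not a description of any cited system: it supposes that a neural network has already mapped each piece of binary code to a fixed-length embedding vector, and it uses cosine similarity with a hypothetical threshold of 0.9 to decide whether two vectors are similar. The function names and the sample vectors are invented for the example.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero embedding as dissimilar to everything
    return dot / (norm_a * norm_b)


def is_similar(a: Sequence[float], b: Sequence[float], threshold: float = 0.9) -> bool:
    """Flag the target as a candidate match when the similarity is high enough."""
    return cosine_similarity(a, b) >= threshold


# Hypothetical embeddings of a known-vulnerable procedure and two targets.
vulnerable = [0.2, 0.8, 0.1]
target_close = [0.25, 0.75, 0.05]
target_far = [0.9, 0.0, 0.4]
print(is_similar(vulnerable, target_close), is_similar(vulnerable, target_far))
```

The threshold trades recall against false positives, which is why the granularity of what is embedded matters: a whole-procedure embedding can stay close to the vulnerable one even when the vulnerable code itself is absent or already fixed.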
