Golden Eye: An OS-Independent Algorithm for Recovering Files From Hard-Disk Raw Images

Fan Zhang, Wei Chen, Yongqiong Zhu
Copyright: © 2022 | Pages: 23
DOI: 10.4018/IJDCF.315793

Abstract

File systems are important sources of intelligence information and digital evidence, and recovering files deleted from a hard disk has long attracted the interest of researchers. Existing file recovery studies rely heavily on an operating system (OS). However, OS services are often unavailable, which makes existing file recovery approaches unusable. To address this issue, the authors design and implement an OS-independent file recovery algorithm named Golden Eye (GE) that targets the EXT4 file system. Given the raw image obtained from a (sanitized) hard disk, GE can automatically recover any designated file or even the whole EXT4 file system. GE is based on an understanding of the on-disk layout of files in EXT4 and does not need any support from additional hardware or software. Experimental results demonstrate the feasibility and correctness of GE. This work not only solves the OS dependency problem that most existing file recovery work suffers from but also reveals that even sanitized hard disks are still at risk of leaking sensitive data.

Introduction

File systems are important sources of confidential and private information. Various types of data (e.g., documents, audio, videos, and pictures) are stored in file systems. Data are often intentionally deleted or unintentionally lost, and therefore different methods of file recovery have been developed for various purposes, such as digital forensics and file rescue.

Depending on whether they use file system metadata, existing file recovery approaches can be divided into two categories: metadata-based file recovery (MFR) (Dewald & Seufert, 2017; Fairbanks, 2012; Jo et al., 2018; Kim et al., 2021; Lee et al., 2020; Lee & Shon, 2014) and carving-based file recovery (CFR) (Garfinkel, 2007; Garfinkel & McCarrin, 2015; Gladyshev & James, 2017; Golden & Vassil, 2005; Hand et al., 2012; Pal et al., 2003; Pal et al., 2008; Tang et al., 2016). MFR is fast and accurate because it can leverage file system metadata to interpret user data. However, MFR cannot work if metadata are missing or corrupted. CFR, by contrast, does not rely on metadata. It leverages syntactic signatures (e.g., file header-footer pairs) (Tang et al., 2016), semantic structures (e.g., explicit control flow paths within a binary executable) (Hand et al., 2012), heuristic techniques (Garfinkel & McCarrin, 2015; Gladyshev & James, 2017; Pal et al., 2008), timestamps (Nordvik et al., 2020; Portera et al., 2021), or deep learning techniques (Heo et al., 2019; Mohammad & Alqahtani, 2019) to restore files. Whereas MFR can precisely recover a file under the “direct guidance” of metadata, CFR “indirectly infers” which data blocks belong to the file to be recovered. Therefore, CFR suffers from problems such as false positives and higher time overhead. In summary, both MFR and CFR have their advantages and disadvantages. They complement each other, and neither can take the place of the other.
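To make the header-footer idea behind CFR concrete, the following is a minimal sketch of signature-based carving over a raw byte buffer. It is not taken from any of the cited tools: the function name `carve`, the `max_size` cap, and the choice of JPEG markers as the example signature are assumptions made purely for illustration. It also shows why carving can produce false positives, since any byte sequence that happens to match the signatures is reported as a file.

```python
from typing import Iterator, Tuple

# Well-known JPEG start-of-image and end-of-image byte markers,
# used here only as an example header-footer pair.
JPEG_HEADER = b"\xff\xd8\xff"
JPEG_FOOTER = b"\xff\xd9"


def carve(raw: bytes,
          header: bytes = JPEG_HEADER,
          footer: bytes = JPEG_FOOTER,
          max_size: int = 10 * 1024 * 1024) -> Iterator[Tuple[int, bytes]]:
    """Yield (offset, candidate_bytes) for each header..footer span in raw.

    Hypothetical helper: it assumes each file is stored contiguously and
    ignores fragmentation, which real carvers have to deal with.
    """
    pos = 0
    while True:
        start = raw.find(header, pos)
        if start == -1:
            return  # no more headers in the buffer
        # Look for the first footer after the header, within max_size bytes.
        end = raw.find(footer, start + len(header), start + max_size)
        if end == -1:
            pos = start + 1  # unmatched header: skip it and keep scanning
            continue
        stop = end + len(footer)
        yield start, raw[start:stop]
        pos = stop


if __name__ == "__main__":
    image = b"\x00" * 10 + JPEG_HEADER + b"abc" + JPEG_FOOTER + b"junk"
    for offset, blob in carve(image):
        print(offset, len(blob))
```

Because the scan uses no metadata, it cannot tell a genuine file from a coincidental match or from a fragment of a larger file, which is the kind of indirect inference the paragraph above contrasts with MFR.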

Although researchers have conducted in-depth and extensive research, there are still issues to be addressed for MFR and CFR. A critical issue is that most existing approaches rely heavily on services from an operating system (OS) (Fairbanks, 2012; Garfinkel, 2007; Garfinkel & McCarrin, 2015; Golden & Vassil, 2005; Hand et al., 2012; Jo et al., 2018; Kim et al., 2021; Lee et al., 2020; Lee & Shon, 2014; Pal et al., 2003; Pal et al., 2008; Tang et al., 2016). However, in many cases an OS is not available. For example, when a hard disk fails or is sanitized according to the U.S. federal guideline NIST SP 800-88 (Kissel et al., 2014), the hard disk can no longer be mounted on another machine under that machine's OS, nor can it boot its own OS, which renders existing approaches (Fairbanks, 2012; Garfinkel, 2007; Garfinkel & McCarrin, 2015; Golden & Vassil, 2005; Hand et al., 2012; Jo et al., 2018; Kim et al., 2021; Lee et al., 2020; Lee & Shon, 2014; Pal et al., 2003; Pal et al., 2008; Tang et al., 2016) unusable.
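As a small illustration of what working without OS file-system services can look like, the sketch below reads a few fields of the primary superblock directly from the bytes of a raw image instead of mounting it. This is not the GE algorithm, whose details are not given in this preview. The function name `read_superblock` and the `partition_offset` parameter are hypothetical; the field offsets and the magic number follow the publicly documented ext2/ext3/ext4 on-disk format, which all three file systems share, so the magic alone does not prove the image is EXT4.

```python
import struct

SUPERBLOCK_OFFSET = 1024  # primary superblock starts 1024 bytes into the file system
SUPERBLOCK_SIZE = 1024
EXT_MAGIC = 0xEF53        # s_magic value shared by ext2, ext3, and ext4


def read_superblock(image: bytes, partition_offset: int = 0) -> dict:
    """Parse a few superblock fields from raw bytes (illustrative helper).

    partition_offset is an assumed parameter: the byte offset of the
    file system inside the image (0 if the image has no partition table).
    """
    base = partition_offset + SUPERBLOCK_OFFSET
    sb = image[base:base + SUPERBLOCK_SIZE]
    if len(sb) < SUPERBLOCK_SIZE:
        raise ValueError("image too small to contain a superblock")

    # All multi-byte fields are stored little-endian.
    inodes_count, blocks_count_lo = struct.unpack_from("<II", sb, 0x00)
    (log_block_size,) = struct.unpack_from("<I", sb, 0x18)
    (magic,) = struct.unpack_from("<H", sb, 0x38)

    if magic != EXT_MAGIC:
        raise ValueError(f"unexpected magic 0x{magic:04X}")

    return {
        "inodes_count": inodes_count,
        "blocks_count_lo": blocks_count_lo,
        "block_size": 1024 << log_block_size,  # block size in bytes
    }
```

A caller would typically load the image with ordinary file I/O (for example `open(path, "rb").read()` on a small test image) and pass the bytes in; no mount operation or file-system driver is involved.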
