Introduction
With their rich and unique features (Cao et al., 2016), hyperspectral images (HSIs) have been widely used in scene-analysis applications such as remote sensing (Deng et al., 2023), precision agriculture (Ishida et al., 2018), national security (Udin et al., 2019), environmental protection (Wright et al., 2019), and astronomical observation (De Angelis et al., 2015). In computer vision, HSIs are extensively used for object tracking (Li et al., 2022; Kim et al., 2012), material classification (Yu et al., 2022; Hong et al., 2022), feature extraction (Li et al., 2020), and medical image analysis (Liu et al., 2019).
To obtain spectral images, traditional methods typically scan the scene along one spatial dimension (1D), two spatial dimensions (2D), or the spectral channels, trading acquisition time for spectral data through multiple exposures. Although traditional methods perform well in terms of spectral detection range and accuracy (Wang et al., 2021), they are unsuitable for dynamic scenes and therefore for consumer applications. Recently, researchers have drawn on developments in compressed sensing (CS) theory to collect HSIs with snapshot compressive imaging (SCI) systems (Du et al., 2009), which compress the 3D spectral information of a scene into a single 2D measurement captured in one exposure. Among current SCI systems, coded aperture snapshot spectral imaging (CASSI) (Wagadarikar et al., 2008) is considered especially promising.
In CASSI, the three-dimensional (3D) spectral cube is modulated by a coded mask and then dispersed before being integrated on a 2D sensor; the redundancy of natural image information can then be exploited to reconstruct the entire 3D cube from this single measurement. Spectral reconstruction methods fall into four categories: conventional model-based methods, deep unfolding networks (DU), end-to-end networks (E2E), and plug-and-play (PnP) methods.
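The mask-then-disperse acquisition described above can be illustrated with a toy simulation. This is a hedged sketch of the single-disperser CASSI idea only, not a model of any specific instrument: the function name, the binary mask, and the one-pixel-per-channel dispersion step are all assumptions made for illustration.

```python
import numpy as np

def cassi_forward(cube, mask, step=1):
    """Toy single-disperser CASSI measurement (illustrative assumption).

    cube: (H, W, L) spectral cube; mask: (H, W) coded aperture.
    Each of the L channels is modulated by the mask, sheared by the
    disperser along one spatial axis, and summed onto a 2D sensor.
    """
    H, W, L = cube.shape
    y = np.zeros((H, W + step * (L - 1)))
    for l in range(L):
        # channel l is shifted by l*step pixels before integration
        y[:, l * step : l * step + W] += cube[:, :, l] * mask
    return y

# toy example: a 4x4 scene with 3 spectral channels
rng = np.random.default_rng(0)
cube = rng.random((4, 4, 3))
mask = (rng.random((4, 4)) > 0.5).astype(float)
y = cassi_forward(cube, mask)
print(y.shape)  # (4, 6): a single 2D snapshot encoding the 3D cube
```

Reconstruction then amounts to inverting this many-to-one mapping, which is where the redundancy priors discussed below come in.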
Traditional methods perform reconstruction using over-complete dictionaries or sparse spectral features, relying on hand-crafted priors and assumptions (Zhang et al., 2019; Wang et al., 2016). Their primary limitation is the need for manual parameter tuning, which leads to poor robustness and slow reconstruction. In recent years, deep learning approaches have shown significant prowess in image generation (Qian et al., 2022; Yu et al., 2018; Li et al., 2019; Chopra et al., 2022), image retrieval (Nhi et al., 2022; Chu et al., 2022; Wang et al., 2020), image-semantic analysis (Hu et al., 2022), image classification (Ghoneim et al., 2018; Mandle et al., 2022), and reconstruction tasks (Arnab et al., 2021) such as image denoising, image super-resolution, and rain and fog removal (Jia et al., 2023; Liu et al., 2022; Liang et al., 2022), and have also been applied to spectral image reconstruction. PnP methods add a learned denoising module to the traditional optimization framework, but offer limited improvement in reconstruction speed and accuracy. The current state-of-the-art (SOTA) methods all come from the E2E and DU families: E2E methods directly learn the mapping between the measurement and the ground-truth data, while DU methods use deep modules to simulate the iterations of a convex optimization algorithm. Although both E2E and DU achieve good performance, the current methods still have limitations.
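The DU idea of simulating optimization iterations with learnable modules can be sketched generically. This is an illustrative assumption, not any published network: one unfolded stage alternates a data-fidelity gradient step with a prior step, where the simple shrinkage function below stands in for the learned denoiser a real DU network would train.

```python
import numpy as np

def gradient_step(x, y, Phi, step_size):
    # data-fidelity step for the linear model y = Phi @ x
    return x - step_size * Phi.T @ (Phi @ x - y)

def toy_prior_step(x, threshold=0.001):
    # stand-in for a learned prior module: soft-thresholding
    return np.sign(x) * np.maximum(np.abs(x) - threshold, 0.0)

def unfolded_reconstruction(y, Phi, n_stages=10, step_size=0.5):
    """Each stage mimics one iteration of a convex solver; in a real
    DU network the prior step (and often the step size) is learned."""
    x = Phi.T @ y  # crude initialization from the measurement
    for _ in range(n_stages):
        x = gradient_step(x, y, Phi, step_size)  # enforce y ≈ Phi x
        x = toy_prior_step(x)                    # learned prior in DU
    return x

# toy underdetermined problem (hypothetical sizes)
rng = np.random.default_rng(1)
Phi = rng.standard_normal((8, 16))
Phi /= np.linalg.norm(Phi, 2)  # normalize for a stable step size
x_true = np.zeros(16)
x_true[3] = 1.0
y = Phi @ x_true
x_hat = unfolded_reconstruction(y, Phi)
```

The key design point, as the text notes, is that the fixed number of stages and the learned modules trade the convergence guarantees of convex solvers for speed and data-adaptive priors.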