1. Trend-Aware Data Imputation Based On Generative Adversarial Network For Time Series
Multivariate time series data are ubiquitous, and time series analysis plays an important role in many fields, such as stock price prediction (Li & Yang, 2020), urban applications (Tabassum et al., 2021), geolocation (Chatzigeorgakidis et al., 2020), financial data modelling (Dogariu et al., 2022), satellite monitoring (Yuan et al., 2023), fault anomaly detection (Patel et al., 2022), and IoT device maintenance (Alghamdi et al., 2022). Time series, however, are often incomplete because of equipment faults, transmission errors, human factors, and other causes, which reduces the effectiveness of data analysis.
Traditional data imputation methods fall into two main categories: deletion-based methods and filling-based methods. A deletion-based method creates the illusion of complete data by deleting the samples that contain missing values. This preserves the integrity of the remaining data, but it reduces the sample size and turns partially missing samples into completely missing ones. It is therefore unsuitable for time series with continuously changing trends (Xu et al., 2020). A filling-based method fills in the missing data by generating new values, and it can be further subdivided into statistics-based methods and machine learning-based methods. Commonly used statistics-based methods include mean imputation (Wolbers et al., 2022), last observation carried forward (Sampoornam et al., 2022), median imputation (Hadeed et al., 2020), mode imputation (Memon et al., 2022), random imputation (Guillaume & Wilfried, 2018), next observation carried backward (Wu et al., 2022), and Lagrange imputation (Essanhaji & Errachid, 2022). The main techniques used in machine learning-based methods include clustering (Lashmar et al., 2021), linear regression (Vance et al., 2022), matrix decomposition (Feng et al., 2023), correlation analysis (Zhang et al., 2021), and multiple imputation (Aleryani et al., 2022). These methods mainly target missing values in non-time-series data.
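To make the distinction between the traditional approaches concrete, the following minimal Python sketch contrasts deletion with a few of the statistics-based filling methods named above (mean, median, last observation carried forward, and next observation carried backward). It is an illustration only and is not taken from any of the cited works: the function names, the use of None to mark a missing value, and the sample readings are all assumptions made for this example.

```python
from statistics import mean, median
from typing import Callable, List, Optional

# Assumption for this sketch: None marks a missing observation.
Series = List[Optional[float]]


def delete_missing(xs: Series) -> List[float]:
    """Deletion-based handling: drop missing samples (the series gets shorter)."""
    return [x for x in xs if x is not None]


def fill_constant(xs: Series, stat: Callable = mean) -> List[float]:
    """Statistics-based filling with one summary value (e.g. mean or median)."""
    value = stat(delete_missing(xs))
    return [value if x is None else x for x in xs]


def locf(xs: Series) -> Series:
    """Last observation carried forward; leading gaps remain None."""
    out: Series = []
    last: Optional[float] = None
    for x in xs:
        if x is not None:
            last = x
        out.append(last)
    return out


def nocb(xs: Series) -> Series:
    """Next observation carried backward; trailing gaps remain None."""
    return locf(xs[::-1])[::-1]


# Hypothetical readings with an upward trend and three gaps.
readings: Series = [1.0, None, 3.0, None, None, 10.0]

print(delete_missing(readings))         # only the three observed values remain
print(fill_constant(readings, mean))    # every gap receives the same mean
print(fill_constant(readings, median))  # every gap receives the same median
print(locf(readings))                   # gaps repeat the previous observation
print(nocb(readings))                   # gaps repeat the next observation
```

In this toy series, deletion discards the time positions of the gaps, and the constant fills insert the same value regardless of where a gap falls, so none of these simple methods follows the rising trend between observations.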
A time series is a chronologically arranged sequence of numerical data points (Ren et al., 2021), and time series are used extensively in many domains of daily life, especially in industrial scenarios. Because time series are generally generated by end-users, edge devices, and various wearable devices, missing values are especially difficult to avoid.