Introduction
With the rapid evolution of information technology and the pervasive integration of the Internet, the graph data structure has become instrumental in modeling many structured or relational systems. In recent years, convolutional neural networks (CNNs) have gained widespread traction in image processing and computer vision (Jiao et al., 2022), addressing tasks such as image detection and recognition (Wang & Zhu, 2023; Sobha & Latifi, 2023). Given the inherent regularity of images as grid-based data, convolution operations lend themselves to straightforward definitions. However, the real world presents a plethora of data exhibiting irregular graph structures, including social networks, citation networks, and biological networks, where nodes represent entities and edges denote relationships between them. To leverage the potential inherent in such graph-structured data, researchers have devised a formidable deep learning tool known as graph neural networks (GNNs; Ding et al., 2019). GNNs typically adhere to a recursive message passing framework, enabling the extraction of profound insights from graph-structured data. They have garnered significant acclaim for their exceptional performance, demonstrating remarkable success across diverse tasks, such as node classification (Wang et al., 2023), link prediction (Barros et al., 2021), and graph classification (Zhou et al., 2020).
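To make the message passing idea concrete, one round of the simplest aggregation scheme (mean over a node's neighbourhood, including a self-loop) can be sketched as follows. The toy graph, feature values, and function name are illustrative assumptions, not taken from the article:

```python
import numpy as np

# Hypothetical toy undirected graph with 4 nodes, given as an adjacency matrix.
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

# One scalar feature per node (illustrative values).
X = np.array([[1.0], [2.0], [3.0], [4.0]])

def message_pass(A, X):
    """One round of mean-aggregation message passing: each node's new
    feature is the average of its own feature and its neighbours'."""
    A_hat = A + np.eye(len(A))             # add self-loops
    deg = A_hat.sum(axis=1, keepdims=True) # per-node degree (with self-loop)
    return (A_hat @ X) / deg               # row-normalised neighbourhood mean

H1 = message_pass(A, X)
# e.g. node 0 averages features of nodes {0, 1, 2}: (1 + 2 + 3) / 3 = 2.0
```

Real GNN layers additionally apply a learned linear transformation and a nonlinearity after this aggregation step; the sketch isolates only the neighbourhood-averaging core.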
The primary advantage of GNNs lies in their efficacy in processing graph-structured data, capturing intricate relationships among nodes and edges while providing a holistic understanding of a graph’s global structure. Nevertheless, despite their proficiency in processing graph data, GNNs face certain challenges. For instance, they are susceptible to the problem of over-smoothing, whereby node representations may become excessively similar over multiple message passing iterations, making it difficult for the model to differentiate between distinct nodes. Furthermore, the computational complexity associated with GNNs poses a challenge, especially when dealing with large-scale graphs. To address these challenges, researchers have proposed numerous enhanced GNN models and optimization techniques, which have achieved noteworthy advancements across various tasks.
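The over-smoothing effect can be demonstrated numerically: repeatedly applying the same mean-aggregation step drives all node features toward a common value, erasing the distinctions between nodes. A minimal sketch, using a hypothetical toy graph (not the article's own experiment):

```python
import numpy as np

# Hypothetical toy undirected graph (adjacency matrix) and node features.
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
X = np.array([[1.0], [2.0], [3.0], [4.0]])

def message_pass(A, X):
    """Mean-aggregation message passing with self-loops."""
    A_hat = A + np.eye(len(A))
    deg = A_hat.sum(axis=1, keepdims=True)
    return (A_hat @ X) / deg

# Stack many rounds of the same aggregation (no learned weights).
H = X.copy()
for _ in range(50):
    H = message_pass(A, H)

# After many rounds, all node features have collapsed to (nearly) one value:
spread = np.ptp(H)  # max - min across nodes; starts at 3.0, ends near 0
```

Mathematically, the aggregation is a row-stochastic operator, so iterating it converges to a degree-weighted average of the initial features on any connected graph; deeper plain GNN stacks suffer the same collapse, which is why depth must be managed carefully.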
These challenges have prompted the emergence of data augmentation techniques for graphs (Ding et al., 2022). Data augmentation enhances a model’s robustness and generalization capability by introducing diverse forms of randomness into the training process, and has seen extensive application in fields such as computer vision and natural language processing. However, due to the irregularity and non-Euclidean structure of graph data, it is difficult to directly apply the data augmentation techniques used in computer vision and natural language processing (Gao et al., 2022; Nadkarni et al., 2011; Wang et al., 2023; Bastings et al., 2017; Zhang et al., 2022) to the graph domain (Bayer et al., 2023; Tsaregorodtsev & Belagiannis, 2023; Sultana & Ohashi, 2023). Consequently, researchers have turned their attention to developing GNN-based methods for data augmentation to address these challenges (Xia et al., 2021; Gaudelet et al., 2021).