Article Preview
Top1. Introduction
Spatial anomalies usually deviate significantly from the global or local distribution and may indicate implicit knowledge, significant changes, surprising patterns and meaningful insight instead of just being useless noises (Shi, Deng, Yang, & Liu, 2017). Because of the attractive means, anomaly detection has become popular and received considerable attention in data mining and image detection in recent years (Khurana, Parthasarathy, & Turaga, 2016; Han, Pei, & Kamber, 2011). The applications cover various domains such as analyzing abnormal distribution of disease and crime (Rosser, Davies, Bowers, Johnson, & Cheng, 2017), discovering extreme climate events (Telang, Deepak, Joshi, Deshpande, & Rajendran, 2014) and detecting traffic congestion (Riveiro, Lebram, & Elmer, 2017), extruding anomalies for hyperspectral images (Du, Zhao, Zhang, & Zhang, 2016), and so on.
Hawkins (2013) firstly defined the concept of the anomaly as “an observation that deviates so much from other observations as to arouse suspicion that it was generated by diffident mechanism”. Then, Shekhar et al. (2003) defined the spatial anomaly as “a spatially referenced entity whose non-spatial attribute values are significantly different from those in its spatial neighborhood”. With advances in spatial data mining, a large number of algorithms were developed to discover the anomaly in spatial datasets. These algorithms can be roughly classified into two categories: those that only consider spatial attributes and those that consider both spatial and non-spatial attributes. In most applications, spatial data have both spatial and non-spatial attributes. Therefore, this article looks at spatial anomaly detection algorithms that consider both spatial and non-spatial attributes.
An anomaly region is also called an anomaly window and denotes a spatial outlying cluster (Liu & Chang, 2013; Akoglu, Tong, & Koutra, 2015). There are two key steps in anomaly region detection which are the determination of spatial neighbors according to geographic information and the difference measurement for a given region and its spatial neighbors with non-spatial attributes. General methods would judge an anomalous region according to a neighborhood. However, the neighborhood may contain outliers or some extreme points. These bad neighbors would give wrong votes. Hence, a wrong score would be given due to the disturbance of bad neighbors.
Discovering the bounds of anomaly regions is another challenge. Because the details of distribution are unknown in general, it is hard to determine terminal condition for the cutting-edge process. The bad cutting-edge methods may lead the regions unacceptable. Generally, if cutting-edge step stops earlier, some anomaly regions would be lost. On the contrary, if cutting-edge step stops later, some anomaly regions would be destroyed. To overcome the problems, the authors propose a region detection method based on Delaunay minimal spanning tree (DMST for short). Our contributions can be summarized as follows.
- 1.
Build the spatial relationship among objects. Based on Delaunay triangle mesh, a weighted graph is constructed to mark differences among objects.
- 2.
Design a new cutting-edge method. Cutting edges and checking anomaly regions are performed at the same time. The outstanding advantage is that the detection result does not rely on the terminal condition of cutting-edge.
- 3.
Validate the high detection performance. The anomaly measurement reveals the local difference well.
The remainder of this article is organized as follows. In Sections 2, the background is discussed. The research strategy and the proposed algorithm are introduced in Section 3. Experiments and analysis are described in Section 4. In Section 5, the findings are summarized.
Top2. Background
Various algorithms have been proposed for spatial anomaly region detection as mentioned above. Those could be classified into two categories with the anomaly distribution: point detection and region detection (when the cardinality of an anomaly region is 1, region detection degenerates into point detection.).