Introduction
In data clustering, especially for data with many different attributes that fall into multiple groups, the Gaussian mixture model (GMM) is considered an appropriate choice. Accordingly, Wang et al. (2017) optimized the clustering parameters of this model. Moreover, because of its flexibility in clustering and in modeling the distribution of data, database models also use it to cluster data.
To increase the efficiency of query processing, most relational and object-oriented database models perform pre-processing steps such as data clustering and query optimization before executing the query and returning results to the user. For example, to increase query-processing efficiency in object-oriented databases, Darabant et al. (2005) proposed a method for horizontally partitioning data based on the fuzzy C-means clustering algorithm. A fuzzy object-oriented database (FOODB) should do the same (Shrivastava, 2013; Yan & Ma, 2013; Alhaji & Arkun, 1993; Kumar et al., 2014; Isran & Israni, 2017; Wedashwara et al., 2015; Pons & Vila, 2013). This paper proposes the following new approaches:
- Flexible clustering optimization using the advanced EMC algorithm. The Expectation Maximization Coefficient (EMC) algorithm improves on the Expectation Maximization (EM) algorithm (Vila & Schniter, 2013; Ahmed et al., 2017; Hao et al., 2014; Jung et al., 2014; Long et al., 2014) by adding a (C) step. In this step, the authors use the coefficient of variation to make the clustering process softer. More specifically, they partition the clusters and compute the density distribution of the elements in each cluster, based on the coefficient of variation of the distances between elements in a cluster, as efficiently as possible. In addition, the EMC algorithm reduces the tendency to settle in local optima and favors global optima; it is covered in section 1.
- The output of the EMC algorithm serves as the input to an algorithm that identifies fuzzy intervals using statistical methods: the authors use both the standard deviation and the mean to compute the upper and lower bounds of each fuzzy interval.
- Finally, this paper proposes methods for optimizing and processing queries based on fuzzy object algebra (FOA) (Nguyen et al., 2018; Alhaji & Arkun, 1993; Yan et al., 2014; Kumar et al., 2014) and equivalent transformation rules, in order to extract data more efficiently using the fuzzy intervals proposed above.
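The statistics named in the contributions above (the coefficient of variation of the distances within a cluster, and a fuzzy interval bounded by the mean and standard deviation) can be illustrated with a minimal sketch. This is not the authors' EMC algorithm or their interval-identification algorithm: the function names, the use of pairwise absolute distances on one-dimensional values, the population standard deviation, and the interval form mean ± k·std with k = 1 are all assumptions made for illustration only.

```python
from itertools import combinations
from statistics import mean, pstdev


def distance_cv(values):
    """Coefficient of variation (std / mean) of pairwise absolute distances.

    Assumption: 1-D numeric values and absolute difference as the distance;
    the source does not specify the distance measure.
    """
    distances = [abs(a - b) for a, b in combinations(values, 2)]
    if not distances:
        raise ValueError("need at least two values")
    avg = mean(distances)
    if avg == 0:
        raise ValueError("all values are identical; CV is undefined")
    return pstdev(distances) / avg


def fuzzy_interval(values, k=1.0):
    """Return (lower, upper) bounds as mean -/+ k * std.

    Assumption: the source only says that both the mean and the standard
    deviation are used; the mean +/- k*std form and k=1 are hypothetical.
    """
    m = mean(values)
    s = pstdev(values)
    return m - k * s, m + k * s


# Hypothetical cluster of attribute values, e.g. produced by a clustering step.
cluster = [20, 22, 24, 26, 28]
cv = distance_cv(cluster)
lower, upper = fuzzy_interval(cluster)
print(f"CV of distances: {cv:.3f}, fuzzy interval: [{lower:.2f}, {upper:.2f}]")
```

A smaller coefficient of variation indicates that the elements of a cluster are spaced more uniformly, and a larger `k` widens the interval so that more values count as belonging to the fuzzy term.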
The paper is organized as follows: part 2 presents clustering optimization and the fuzzy foundations, part 3 presents a query optimization approach based on equivalent transformations together with a proposed heuristic algorithm, and part 4 states the conclusion.