Introduction
Agriculture directly contributes about 2.5% to the gross domestic product (GDP) of South Africa (Greyling, 2015), with a further 14% contributed through related manufacturing and processing (World Wide Fund for Nature, 2018). Fruits and vegetables, including grapes, make up 50.8% of food production, with about 90% produced under irrigation (Tibane, 2016). The ability to accurately assess the area covered by crops (by creating crop maps) is vital to government and agriculture-related agencies (Myburgh, 2015; Yalcin & Günay, 2016). Digital crop maps are often used to derive agricultural statistics such as crop yield, water stress and soil properties, and can aid decision making in agricultural regions (Delenne, Durrieu, Rabatel, & Deshayes, 2010; Lee et al., 2010; Turker & Kok, 2013; Van Niekerk et al., 2018).
The traditional approach to mapping field boundaries is to manually digitize them from aerial or satellite imagery. However, manual digitizing is time-consuming, labour-intensive, costly, subjective and open to human error (Yalcin & Günay, 2016). A variety of semi-automated image classification techniques have consequently been attempted to improve efficiency and reduce costs (Yan, Shaker, & El-Ashmawy, 2015). Machine learning algorithms are increasingly being used to differentiate crop types from satellite imagery (Gilbertson, Kemp, & Van Niekerk, 2017; Möller et al., 2016). These non-parametric algorithms are robust under high dimensionality (i.e. a large number of input variables) and are able to deal with non-normally distributed data (Al-doski, Mansor, Zulhaidi, & Shafri, 2013; Gilbertson & Van Niekerk, 2017). Popular machine learning algorithms include decision tree (DT), neural network (NN), random forest (RF), k-nearest neighbour (k-NN) and support vector machine (SVM) (Al-doski et al., 2013).
DT recursively separates a dataset into smaller subdivisions according to defined tests at each branch (node) in the tree (Friedl & Brodley, 1997). A DT consists of a start node, a set of internal nodes and a set of end nodes (leaves). The start node is created using the entire dataset and splits it (based on the value of one variable) into internal nodes, each representing a class. When an internal node represents a single class, it is turned into a leaf (end) node (Rutkowski et al., 2014). RF is an ensemble classifier consisting of multiple DTs. The DTs are generated from subsets of the training samples drawn with replacement (bootstrap aggregation, i.e. bagging). The final classification is an average of the classifications produced by the different DTs (Möller et al., 2007). Extreme gradient boosting (XGBoost) is an extension of traditional boosting ensemble techniques within the DT family. Boosting sequentially generates models and then combines the weaker-performing classifiers into one strong model that iteratively improves on the errors of the previous classifiers (Xia et al., 2017). k-NN is a distance-based classification algorithm (Weinberger, Blitzer, & Saul, 2006) that assigns a label to an unknown sample based on the labels of the training samples closest to it in feature space (Adejuwon & Mosavi, 2010). Logistic regression (LR) is a linear model used for classification; it finds a multivariate regression relationship between a dependent variable and several independent variables (Pradhan, 2010). Naïve Bayes (NB) is a probabilistic classifier based on Bayes' theorem from Bayesian statistics (Zelinsky, 2009), while SVM is a non-parametric supervised classification algorithm that builds a model by mapping the training dataset into a higher-dimensional space and then attempts to separate the classes using hyperplanes that minimize classification errors (Zheng et al., 2015).
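As an illustrative sketch (not drawn from this study), the classifier families described above can all be instantiated through scikit-learn and compared on a synthetic dataset. The dataset, parameter values and the use of scikit-learn's GradientBoostingClassifier as a stand-in for the XGBoost library are assumptions for illustration only:

```python
# Hypothetical comparison of the classifier families named in the text,
# using scikit-learn on a synthetic stand-in for per-field spectral features.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Synthetic data: 500 samples, 10 features, 3 "crop" classes (assumed values).
X, y = make_classification(n_samples=500, n_features=10, n_informative=6,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

classifiers = {
    "DT":   DecisionTreeClassifier(random_state=0),
    "RF":   RandomForestClassifier(n_estimators=100, random_state=0),  # bagged DTs
    "Boost": GradientBoostingClassifier(random_state=0),  # boosting (XGBoost analogue)
    "k-NN": KNeighborsClassifier(n_neighbors=5),          # distance-based
    "LR":   LogisticRegression(max_iter=1000),            # linear model
    "NB":   GaussianNB(),                                 # Bayes' theorem
    "SVM":  SVC(kernel="rbf"),                            # hyperplane separation
}

for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    print(f"{name}: test accuracy {clf.score(X_test, y_test):.2f}")
```

Note that the ensemble methods (RF and boosting) typically outperform a single DT on such data, which is the motivation for using them in crop-type classification.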
NN is modelled after the constructs of the human brain, where "intelligence" is stored in neural pathways as well as in memory. In an NN, the knowledge is stored in the weights applied to each node (neuron) (Miller, Kaminsky, & Rana, 1995). Another form of NN is the deep neural network (d-NN), which refers to an NN with multiple hidden layers.
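A minimal sketch of this idea, assuming scikit-learn's MLPClassifier as the network implementation (the layer sizes and data are illustrative, not from this study): the learned "knowledge" is held in the weight matrices connecting successive layers, and stacking more than one hidden layer yields the multi-layered (d-NN) form.

```python
# Illustrative multi-layered NN (d-NN in the authors' terminology).
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Synthetic binary-classification data with 8 input features (assumed values).
X, y = make_classification(n_samples=300, n_features=8, random_state=1)

# Two hidden layers (16 and 8 neurons) make this a multi-layered network.
nn = MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=2000, random_state=1)
nn.fit(X, y)

# The network's knowledge lives in one weight matrix per layer transition:
# inputs->hidden1 (8x16), hidden1->hidden2 (16x8), hidden2->output (8x1).
for i, w in enumerate(nn.coefs_):
    print(f"layer {i}: weight matrix shape {w.shape}")
```

The printed shapes show where the weights (the stored knowledge) reside: one matrix per connection between consecutive layers.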