Remote Sensing Image Semantic Segmentation Method Based on a Deep Convolutional Neural Network and Multiscale Feature Fusion

Guangzhen Zhang, Wangyang Jiang
Copyright: © 2023 | Pages: 16
DOI: 10.4018/IJSWIS.333712

Abstract

Remote sensing images pose many challenges, including large data scales, complex illumination conditions, occlusion, and dense targets. Existing semantic segmentation methods for remote sensing images are insufficiently accurate on small and irregular targets, and their edge extraction results are poor. The authors propose a remote sensing image segmentation method based on a deep convolutional neural network (DCNN) and multiscale feature fusion. First, an end-to-end remote sensing image segmentation model using complete residual connections and multiscale feature fusion was designed on the basis of a deep convolutional encoder-decoder network. Second, weighted high-level features were obtained using an attention mechanism, which better preserves the edges, texture, and other information of remote sensing images. Experimental results on the ISPRS Potsdam and Urban Drone datasets show that, compared with the baseline methods, the proposed method segments small and irregular objects better and achieves the best segmentation performance while maintaining computation speed.
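The decoder-side fusion described in the abstract can be sketched minimally: a coarse, deep feature map is upsampled to the resolution of a finer, shallower one and the two are concatenated along the channel axis. This is a generic illustration of multiscale feature fusion, not the paper's exact architecture; all shapes and names are hypothetical.

```python
import numpy as np

def upsample2x(feat):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return feat.repeat(2, axis=1).repeat(2, axis=2)

def fuse(coarse, fine):
    """Multiscale fusion: bring the coarse map to the fine map's
    resolution, then concatenate along the channel axis."""
    up = upsample2x(coarse)
    assert up.shape[1:] == fine.shape[1:], "spatial sizes must match"
    return np.concatenate([up, fine], axis=0)

# Hypothetical feature maps from two encoder stages.
coarse = np.random.rand(64, 8, 8)    # deep, low-resolution features
fine = np.random.rand(32, 16, 16)    # shallow, high-resolution features
fused = fuse(coarse, fine)
print(fused.shape)  # (96, 16, 16)
```

In a real encoder-decoder, a convolution would typically follow the concatenation to mix the fused channels; the sketch stops at the fusion step itself.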

Introduction

Remote sensing images can clearly express the internal geometric structure and rich spatial information of objects (Yuan et al., 2021; Yu et al., 2018; Kemker et al., 2018; Al-Sobbahi & Tekli, 2022). Their resolution keeps increasing, their data dimensions (space, time, spectra, etc.) keep growing, and the volume of data is rising accordingly. Mutual occlusion between shadows and targets in remote sensing images leads to small differences between different types of ground objects (Li et al., 2019; Memos et al., 2018). Improving the quality of features and the accuracy of edge-region recognition yields better image segmentation performance. As application requirements continue to rise, related fields demand ever higher accuracy and effectiveness from remote sensing image segmentation.

The goal of image segmentation is not only to obtain the contours of the regions to be segmented but also to accurately label each pixel in those regions with a representative semantic category (Nhi & Le, 2022; Chu et al., 2022; Mandle et al., 2022). Image segmentation partitions a given region based on position, spectra, and other shallow features of an image, as well as abstract semantic features, to obtain the semantic label of each pixel. Traditionally, the classification of surface cover has depended on manual operation by professionals (Zheng et al., 2022; Wang et al., 2020; Chopra et al., 2022; Wang et al., 2023). However, manual feature extraction is a shallow feature learning approach, which struggles with intra-class variation and inter-class similarity. In addition, manual features are usually applicable only to specific tasks and rely on expert knowledge for hyperparameter settings, reducing the general applicability of traditional segmentation methods across scenarios. Early machine learning methods used classifiers such as thresholding and support vector machines to segment different ground objects according to the spectral indices, texture, geometry, and other characteristics of images (Bokhovkin & Burnaev, 2019). Compared with manual methods, these classifiers are more convenient and efficient, but they struggle to classify surface cover accurately. Moreover, in different regions and complex scenes, it is difficult for such features to represent all pixels of an object; the methods transfer poorly between scenes and are therefore unsuitable for massive data processing (Li et al., 2021; Qian et al., 2022).
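The threshold-based classical approach mentioned above can be illustrated with a tiny example: labelling pixels by thresholding a spectral index. The NDVI index, threshold value, and toy band values below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def ndvi_segment(red, nir, threshold=0.3):
    """Label each pixel as vegetation (1) or background (0) by
    thresholding the normalized difference vegetation index."""
    ndvi = (nir - red) / (nir + red + 1e-8)  # small epsilon avoids /0
    return (ndvi > threshold).astype(np.uint8)

# Toy 2x2 bands: NIR much brighter than red suggests vegetation.
red = np.array([[0.1, 0.5], [0.2, 0.6]])
nir = np.array([[0.8, 0.5], [0.9, 0.6]])
mask = ndvi_segment(red, nir)
print(mask)  # [[1 0]
             #  [1 0]]
```

A single global threshold like this is exactly what fails in the complex scenes discussed above: the same index value can correspond to different ground objects under different illumination.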

In recent years, deep learning has become mainstream. Compared with traditional handcrafted features, neural networks can extract rich feature information from the original image through multi-level structures. The shallow layers of deep models extract simple features such as texture, shape, and edges, while deeper layers extract more complex and abstract information, such as the semantic (category) information of the target. The strong feature representation and learning ability of deep networks enables them to adapt to scene changes, and such networks have been successfully applied to remote sensing image segmentation tasks (Anil et al., 2022; Hasib et al., 2021). Therefore, this paper proposes a segmentation method based on a deep convolutional neural network and multiscale feature fusion, in which weighted high-level features are obtained with an attention mechanism to better retain the edges, texture, and other information of remote sensing images.
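The attention-based weighting of high-level features can be sketched as a squeeze-and-excitation-style channel attention: each channel of a feature map is scaled by a gate computed from its global average. This is one common formulation, not necessarily the paper's exact mechanism; the weights and shapes below are random placeholders.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """Reweight each channel of a (C, H, W) map by a gate in (0, 1)
    derived from its global average (squeeze-and-excitation style)."""
    squeeze = feat.mean(axis=(1, 2))                 # (C,) global average pool
    hidden = np.maximum(0.0, w1 @ squeeze)           # bottleneck + ReLU
    gate = sigmoid(w2 @ hidden)                      # (C,) per-channel weights
    return feat * gate[:, None, None]                # scale each channel

rng = np.random.default_rng(0)
feat = rng.random((16, 8, 8))            # hypothetical high-level features
w1 = rng.standard_normal((4, 16)) * 0.1  # reduction ratio 4 (assumed)
w2 = rng.standard_normal((16, 4)) * 0.1
out = channel_attention(feat, w1, w2)
print(out.shape)  # (16, 8, 8)
```

Because every gate lies in (0, 1), informative channels are kept nearly intact while uninformative ones are suppressed, which is how attention can preserve edge and texture cues in the fused features.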
