Integrated Design of Building Environment Based on Image Segmentation and Retrieval Technology

Zhou Li, Hanan Aljuaid
DOI: 10.4018/IJITSA.340774

Abstract

Existing models still fall short in capturing detailed contextual information when processing architectural images. This paper introduces a model for architectural image segmentation and retrieval built on an image segmentation network. First, spatial attention is incorporated into the U-Net segmentation network to strengthen the extraction of image features. Next, a dual-path attention mechanism is integrated into the U-Net backbone network to fuse information across different spaces and scales. On the test set, the proposed model reaches an average Dice coefficient, accuracy, and recall of 94.67%, 95.61%, and 97.88%, respectively, outperforming the comparison models. The proposed model improves the U-Net network's ability to identify targets within feature maps. Combining image segmentation networks with attention mechanisms enables precise segmentation and retrieval of architectural images.
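For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how the Dice coefficient, accuracy, and recall are commonly computed for a pair of binary masks. The function name `segmentation_metrics` and the toy masks are illustrative assumptions; the paper's own evaluation code and averaging scheme are not given in this preview.

```python
import numpy as np

def segmentation_metrics(pred, target):
    """Return (dice, accuracy, recall) for two non-empty binary masks of equal shape.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)

    tp = np.logical_and(pred, target).sum()    # foreground predicted as foreground
    fp = np.logical_and(pred, ~target).sum()   # background predicted as foreground
    fn = np.logical_and(~pred, target).sum()   # foreground predicted as background
    tn = np.logical_and(~pred, ~target).sum()  # background predicted as background

    # Dice = 2*TP / (2*TP + FP + FN); defined as 1.0 when both masks are empty.
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    accuracy = (tp + tn) / pred.size
    # Recall = TP / (TP + FN); defined as 1.0 when there is no foreground to find.
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return float(dice), float(accuracy), float(recall)

# Toy 2x3 masks (assumed data): 2 true positives, 1 false positive, 1 false negative.
pred = np.array([[1, 1, 0], [0, 1, 0]])
target = np.array([[1, 0, 0], [0, 1, 1]])
dice, accuracy, recall = segmentation_metrics(pred, target)
```

Averaging these per-image values over a test set is one plausible way to obtain figures like those quoted above, though the preview does not state how the averages were formed.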

Before image-segmentation and retrieval technology was applied to architectural design and urban development, architects who wanted a comprehensive understanding of the design schemes and aesthetic styles of integrated architectural environments typically relied on discussions with peers and on the relevant literature. These conventional methods, however, proved inadequate for meeting the sensory requirements of urban design with respect to architectural style and green environments. Image-segmentation and -retrieval technology offers a new solution to this problem. For architects, segmenting and retrieving architectural images is a better way to acquire knowledge about integrated architectural environment design, which in turn supports the development of green cities. The key task therefore becomes the construction of an intelligent and efficient architectural image-segmentation and -retrieval model.

In recent years, deep learning–based semantic image segmentation has been applied widely across various domains, primarily to address issues such as fuzzy boundaries, low precision, and low resolution in images. When image-segmentation techniques are applied to architectural images, the model is expected not only to delineate specific architectural features accurately and refine architectural categories but also to help designers obtain more precise design solutions.

Deep learning–based semantic segmentation of images (Ulku and Akagündüz, 2022; Hemamalini et al., 2022) has been widely adopted across various domains, where it addresses issues such as fuzziness and low resolution in images. Erdi et al. (1997) introduced an end-to-end neural network for semantic image segmentation. Li et al. (2019) proposed a U-Net network structure based on fully convolutional networks (FCNs) that is better suited to fine image processing. Unlike the FCN, which fuses features by summation, U-Net applies repeated downsampling and upsampling operations to gradually acquire high-level semantic information. It also incorporates skip connections, which concatenate feature maps of matching size along the channel dimension, thereby enhancing feature fusion and significantly improving segmentation performance. Although U-Net has been successful in image segmentation, its limited ability to extract detailed contextual information has prompted a series of U-Net variants. For instance, Duan et al. (2018) designed a lightweight SegNet model, introducing a novel upsampling method for efficient image segmentation.
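The contrast drawn above between summation-based fusion and U-Net's concatenating connections can be illustrated with a small NumPy sketch. The helper names (`upsample2x`, `fuse_by_sum`, `fuse_by_concat`), the nearest-neighbour upsampling, and the feature-map shapes are assumptions chosen for illustration; real networks typically use learned upsampling and convolutions around these steps.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour upsampling of a (C, H, W) feature map by a factor of 2."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def fuse_by_sum(encoder_feat, decoder_feat):
    """FCN-style fusion: element-wise sum, so the channel count is unchanged."""
    return encoder_feat + upsample2x(decoder_feat)

def fuse_by_concat(encoder_feat, decoder_feat):
    """U-Net-style fusion: stack along the channel axis, so channels add up."""
    return np.concatenate([encoder_feat, upsample2x(decoder_feat)], axis=0)

# Assumed shapes: a 64-channel encoder map at 32x32 and a 64-channel decoder map at 16x16.
encoder_feat = np.ones((64, 32, 32))
decoder_feat = np.ones((64, 16, 16))

summed = fuse_by_sum(encoder_feat, decoder_feat)
concatenated = fuse_by_concat(encoder_feat, decoder_feat)
```

Under these assumptions, summation yields a 64-channel map while concatenation yields a 128-channel map, so the layers that follow can weigh encoder detail and decoder semantics separately rather than receiving them already merged.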

The UNet++ network, an extension of U-Net, represents a notable breakthrough in image-segmentation technology. It addresses the adaptive selection of sampling depth for different samples and accelerates the extraction of feature information at various levels. Its drawback is a sharp increase in the number of model parameters, which raises computational costs and places significant demand on GPU resources (Zhou et al., 2018). As network models grew deeper, Tan et al. (2021) proposed the AcuNet network, which uses depthwise separable convolution to reduce model parameters. Trebing et al. (2021) introduced the At-UNet segmentation network, which adds an attention mechanism to U-Net and replaces traditional convolution with depthwise convolution. Cao and Zhang (2020) proposed an updated Res-UNet model for high-resolution image segmentation. He et al. (2020) presented a hybrid attention approach for effective architecture segmentation. Zhao et al. (2022) introduced an Inception v3–based image-segmentation method to improve segmentation accuracy on small target images. Zhao et al. (2017) proposed a pyramid scene-parsing network that integrates contextual data and fully exploits global features for semantic segmentation of diverse scenes. He et al. (2017) introduced Mask R-CNN for image segmentation, achieving high-quality semantic segmentation while performing target detection.
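To make the parameter savings of depthwise separable convolution concrete, the following sketch compares weight counts for a single layer. The function names and the layer size (3×3 kernel, 64 input channels, 128 output channels) are assumptions for illustration, and bias terms are ignored; the figures do not describe any specific model cited above.

```python
def standard_conv_params(kernel_size, in_channels, out_channels):
    """Weights in a standard 2-D convolution: one k x k filter per (input, output) channel pair."""
    return kernel_size * kernel_size * in_channels * out_channels

def depthwise_separable_params(kernel_size, in_channels, out_channels):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = kernel_size * kernel_size * in_channels  # one k x k filter per input channel
    pointwise = in_channels * out_channels               # 1 x 1 convolution that mixes channels
    return depthwise + pointwise

# Assumed layer size for illustration only.
standard = standard_conv_params(3, 64, 128)
separable = depthwise_separable_params(3, 64, 128)
reduction = standard / separable
```

For this assumed layer the standard convolution needs 73,728 weights and the separable version 8,768, roughly an eightfold reduction, which is why the technique is attractive as networks deepen.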
