Recognition and Analysis of Scene-Emotion in Photographic Works Based on AI Technology

Wenbin Yang
DOI: 10.4018/IJITSA.326055

Abstract

Emotional response is highly subjective in human cognition, and a single discrete emotion label can hardly describe an immersive scene, which places higher demands on affective computing in photography. This article therefore first constructs a photographic scene recognition model and then builds a visual emotion analysis model that optimizes the basic structure of VGG19 within a CNN framework. The user's photography-situation information is extracted from the corresponding image metadata, a mapping between situation and emotion is established, and a low-dimensional dense vector representation of the situation features is obtained through embedding. The authors define eight emotion categories, compare model accuracy, and analyze the scene-emotion feature distribution across different works. The results show that the scene-emotion recognition model achieves high accuracy after multimodal fusion, reaching 73.9%; in addition, different shooting scenes yield distinguishable emotional characteristics in the works.
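The pipeline summarized above begins by pulling photography-situation information out of image metadata. The following is a minimal sketch of that step under assumptions the article preview does not confirm: it treats EXIF as the metadata source and uses Pillow to read it, and the tags it collects are illustrative only, not the paper's actual feature set.

```python
# A minimal sketch, not the authors' implementation: reading image
# metadata (EXIF) with Pillow as a stand-in for the paper's
# "photography situation" extraction. Which tags matter is an
# assumption; the article does not list its metadata fields.
from PIL import Image
from PIL.ExifTags import TAGS

def extract_situation(path: str) -> dict:
    """Collect human-readable EXIF tags from one photograph."""
    exif = Image.open(path).getexif()
    # Base IFD holds tags such as DateTime and camera Model.
    situation = {TAGS.get(t, t): v for t, v in exif.items()}
    # Exposure-related tags (FNumber, Flash, ISO) sit in the Exif
    # sub-IFD, reached through pointer tag 0x8769.
    for t, v in exif.get_ifd(0x8769).items():
        situation[TAGS.get(t, t)] = v
    return situation
```

The resulting dictionary of tag names to values is the kind of contextual signal that could then be discretized and mapped into the situation-emotion embedding the abstract describes.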

1. Introduction

By analyzing and modeling information such as the content of real-time photography scenes and the emotional preferences of users, and by combining cutting-edge deep learning technologies such as image understanding and text generation, the emotional state of users and the content of photography scenes can be accurately analyzed (Wei et al., 2022). Existing sentiment analysis, however, mostly starts from texts that users produce on the Internet and applies natural language processing and related techniques (Yu et al., 2022; Chatterjee, 2019). With the great improvement in the ability of convolutional neural networks to process image information, more studies analyze users' emotions through their photographic works and achieve good emotion classification results (Rao et al., 2016; Meng et al., 2021). Unlike object recognition and scene recognition, visual emotion analysis involves more complicated factors beyond the image itself: owing to individual factors (growth environment, cultural background, social background, etc.), different people have different emotional understandings of the same image (Burkitt, 2002). It is therefore necessary to consider as rich a set of elements as possible in visual emotion analysis.

A deep learning model can take texture, color, and other image information as input, automatically extract the image's emotional features, and exploit the dependencies between features at different levels to learn their representations, which has achieved good results in classifying photographic emotions (Li, 2019; Zhu et al., 2019). However, when photographic features are learned only from a global perspective, it is difficult for a convolutional neural network to determine which region or pattern in a photographic work expresses the emotion felt during shooting, or to quantify the influence of regional information on the overall emotion of the work. Users' emotions toward photography are subjective and related to many factors, such as the environment at the time and the photographed content. Classifying emotion only at the image level ignores the contextual information behind the image, even though abundant emotional information is hidden in the scene; it is therefore difficult to accurately capture users' fine-grained emotions (Bhunia et al., 2022; Li et al., 2020).

A traditional CNN can analyze only a single feature domain, ignoring the contextual information behind the photograph. Deep learning methods such as convolutional neural networks, feature embedding, and multi-feature fusion can be combined to improve emotion recognition. In response to the above problems, the main contributions of this study are as follows (a minimal sketch of the resulting pipeline appears after the list):

  • (1) This paper optimizes the basic structure of VGG19 within a CNN framework and builds both a photographic scene recognition model and a visual emotion analysis model, establishing the mapping relationship between scene and emotion;

  • (2) The user's photography-situation information is extracted from the corresponding image metadata, the mapping relationship between situation and emotion is established, and a low-dimensional dense vector representation of the situation features is obtained through embedding;

  • (3) The features of photographic works and the contextual features behind them are fully fused, expanding the feature domain of the data.
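To make contributions (1)-(3) concrete, the sketch below shows one plausible shape for the fused model: a VGG19 convolutional trunk for image features, an embedding layer producing the low-dimensional dense situation vector, and concatenation as the multimodal fusion before an eight-way emotion classifier. The layer sizes, the vocabulary of situation IDs, and the choice of plain concatenation are assumptions for illustration, not the paper's reported architecture.

```python
# A minimal PyTorch sketch, not the paper's exact architecture:
# VGG19 image features concatenated with an embedded "situation"
# vector, feeding an eight-way emotion classifier. Embedding width,
# hidden size, and concatenation-as-fusion are all assumptions.
import torch
import torch.nn as nn
from torchvision.models import vgg19

class SceneEmotionNet(nn.Module):
    def __init__(self, num_situations: int = 100,
                 embed_dim: int = 32, num_emotions: int = 8):
        super().__init__()
        backbone = vgg19(weights=None)      # pretrained weights optional
        self.features = backbone.features   # VGG19 convolutional trunk
        self.pool = nn.AdaptiveAvgPool2d((7, 7))
        self.img_fc = nn.Linear(512 * 7 * 7, 256)
        # Low-dimensional dense representation of situation features.
        self.situation_embed = nn.Embedding(num_situations, embed_dim)
        self.classifier = nn.Linear(256 + embed_dim, num_emotions)

    def forward(self, image: torch.Tensor, situation_id: torch.Tensor):
        x = self.pool(self.features(image)).flatten(1)
        x = torch.relu(self.img_fc(x))
        s = self.situation_embed(situation_id)  # (batch, embed_dim)
        fused = torch.cat([x, s], dim=1)        # multimodal fusion
        return self.classifier(fused)           # logits over 8 emotions
```

A forward pass takes a batch of 224×224 RGB images plus integer situation IDs, e.g. `SceneEmotionNet()(torch.randn(2, 3, 224, 224), torch.tensor([3, 7]))`, and returns logits over the eight emotion categories.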
