Introduction
The refinery mainly produces gasoline, aviation kerosene, diesel oil, asphalt, polyethylene, polypropylene, polyvinyl chloride, acrylonitrile, butanol, caustic soda, benzene, and other chemicals, and the safety of personnel engaged in hot work is a major concern. At present, enterprises mainly rely on portable, individually carried video terminals (Abdi & Williams, 2010; Azami et al., 2019), supplemented by cameras, to film on-site operations, and they assign inspectors to supervise on site; this allows video from the construction site to be uploaded in real time. However, the volume of surveillance video is huge, and the inspection team's time and energy are limited, so inspectors cannot watch the footage for extended periods (Barz & Denzler, 2021). As a result, they can only supervise through centralized random inspections, and manual screening makes it difficult to detect violations and uncover hidden risks. The video platform also cannot provide early warning or real-time communication, and there is no means at the operation site to alert construction personnel in time (Bo & Jiulun, 2022). Violations can only be traced after the fact, which increases both the inspectors' workload and the probability of safety hazards.
With the development of computer technology, artificial intelligence has become increasingly mature (Díaz et al., 2021; Feng et al., 2019; Gordo et al., 2016; Huang et al., 2015). Deep learning is built on deep neural networks, each of which consists of an input layer, multiple hidden layers, and an output layer. Each layer contains several neurons, and every connection between neurons carries a weight. Each neuron mimics a human nerve cell, and the connections between nodes mimic those between nerve cells (Krizhevsky et al., 2012; Lai & Chen, 2011). Deep learning uses a layer-by-layer training mechanism to obtain initial values for the parameters of each layer; backpropagation is then used to compute the gradient of the objective function. In deep networks (more than 7 layers), the residuals propagated back to the first layers become very small, a phenomenon called gradient diffusion (also known as the vanishing gradient problem). Deep neural network models are complex and require large amounts of both computation and training data. On the one hand, a deep neural network is a simulation of the human brain; mathematically, every neuron includes an activation function (such as the sigmoid, ReLU, or softmax function), and the number of parameters to estimate is extremely large. In speech recognition and image recognition applications, a network can have tens of thousands of neurons and tens of millions of parameters, so the model is very complex and training it requires a large amount of computation. On the other hand, deep neural networks need large amounts of data to train models with high accuracy (Li et al., 2021; Lowe, 2004; Manjunath & Ma, 1996; Menghao & Hongwei, 2019).
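The shrinking of backpropagated residuals described above can be illustrated with a minimal sketch. It is not the model used in this article: the function name `backprop_error_norms`, the 8-layer depth, the width of 4 neurons, the sigmoid activation, the uniform weight range, and the squared-error loss with an all-zero target are all assumptions chosen for illustration. Because the sigmoid derivative never exceeds 0.25, the error term tends to shrink each time it passes back through a layer when the weights are small.

```python
import math
import random


def sigmoid(z):
    """Logistic activation; its derivative never exceeds 0.25."""
    return 1.0 / (1.0 + math.exp(-z))


def backprop_error_norms(depth=8, width=4, seed=0):
    """Return the Euclidean norm of the backpropagated error at each layer,
    ordered from the first layer to the output layer (hypothetical helper)."""
    rng = random.Random(seed)
    bound = 1.0 / math.sqrt(width)  # assumed initialization range

    # weights[k][i][j]: weight from input neuron j of layer k to its neuron i
    weights = [
        [[rng.uniform(-bound, bound) for _ in range(width)] for _ in range(width)]
        for _ in range(depth)
    ]

    # Forward pass: keep each layer's activations for the backward pass.
    signal = [rng.uniform(-1.0, 1.0) for _ in range(width)]
    activations = []
    for layer in weights:
        signal = [sigmoid(sum(w * a for w, a in zip(row, signal))) for row in layer]
        activations.append(signal)

    # Output error for the loss 0.5 * ||output - target||^2 with a zero target:
    # (a - 0) * sigmoid'(z) = a * a * (1 - a).
    delta = [a * a * (1.0 - a) for a in activations[-1]]
    norms = [math.sqrt(sum(d * d for d in delta))]

    # Backward pass: delta_k = (W_{k+1}^T delta_{k+1}) * sigmoid'(z_k).
    for k in range(depth - 2, -1, -1):
        upstream = weights[k + 1]
        delta = [
            sum(upstream[i][j] * delta[i] for i in range(width))
            * activations[k][j] * (1.0 - activations[k][j])
            for j in range(width)
        ]
        norms.append(math.sqrt(sum(d * d for d in delta)))

    norms.reverse()  # first layer first, output layer last
    return norms


if __name__ == "__main__":
    for index, norm in enumerate(backprop_error_norms(), start=1):
        print(f"layer {index}: error norm = {norm:.3e}")
```

In this setting the printed norms decrease from the output layer toward the first layer, which is why the earliest layers of a deep sigmoid network receive very small weight updates. Other activations such as ReLU, or other initialization schemes, would behave differently.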