Digitally-Signed Video/Audio Streams as Prevention of AI-Based Attacks

DOI: 10.4018/IJSSCI.2021100104

Abstract

The main purpose of this research is to provide a solution for digitally signing video and audio recordings, which is extremely important not only for copyright protection but also for determining whether a video is fake. In a world where neural networks are used more and more often, it becomes easy to make a fake video using the face of a famous politician, perhaps even the president of the US, and to broadcast it, which can trigger a range of negative events during a political crisis.

Introduction

Image recognition uses artificial intelligence technology to automatically identify objects, people, places and actions in images. Image recognition is used to perform tasks like labeling images with descriptive tags, searching for content in images, and guiding robots, autonomous vehicles, and driver assistance systems (Galiautdinov R., 2020a).

Image recognition is natural for humans and animals but is an extremely difficult task for computers to perform (Galiautdinov R., 2020b). Over the past two decades, the field of Computer Vision has emerged, and tools and technologies have been developed which can rise to the challenge.

The most effective tool found for the task of image recognition is a deep neural network, specifically a Convolutional Neural Network (CNN). A CNN is an architecture designed to efficiently process, correlate and understand the large amount of data in high-resolution images.

The human eye sees an image as a set of signals, interpreted by the brain’s visual cortex. The outcome is an experience of a scene, linked to objects and concepts that are retained in memory. Image recognition imitates this process. Computers ‘see’ an image either as a set of vectors (color-annotated polygons) or as a raster (a canvas of pixels with discrete numerical values for colors).
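
The raster representation described above can be sketched in a few lines of Python. This is only an illustration: the pixel values are made up, and the helper names to_grayscale and flatten are hypothetical, not taken from the article.

```python
# A tiny 2x3 RGB raster: rows of pixels, each pixel an (R, G, B) triple in 0..255.
# The pixel values are invented for illustration.
raster = [
    [(255, 0, 0), (0, 255, 0), (0, 0, 255)],
    [(255, 255, 255), (128, 128, 128), (0, 0, 0)],
]

def to_grayscale(image):
    """Average the three color channels of each pixel into one intensity."""
    return [[sum(pixel) // 3 for pixel in row] for row in image]

def flatten(image):
    """Concatenate the rows into one flat list of numbers."""
    return [value for row in image for value in row]

gray = to_grayscale(raster)
inputs = flatten(gray)
print(inputs)  # [85, 85, 85, 255, 128, 0]
```

A flat list of numbers like inputs is the form in which a simple network would typically receive the image.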

In the process of neural network image recognition, the vector or raster encoding of the image is turned into constructs that depict physical objects and features. Computer vision systems can logically analyze these constructs, first by simplifying images and extracting the most important information, then by organizing data through feature extraction and classification.

Finally, computer vision systems use classification or other algorithms to make a decision about the image or a part of it: which category it belongs to, or how it can best be described.

One type of image recognition algorithm is an image classifier. It takes an image (or part of an image) as an input and predicts what the image contains. The output is a class label, such as dog, cat or table. The algorithm needs to be trained to learn and distinguish between classes.
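
As a minimal sketch of the last step of such a classifier, the code below turns raw per-class scores into a class label. It assumes the model has already produced one score per class; the scores are invented, softmax is one common choice for normalizing them rather than something the article specifies, and predict_label is a hypothetical name.

```python
import math

LABELS = ["dog", "cat", "table"]  # class names taken from the text

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

def predict_label(scores, labels=LABELS):
    """Return the most probable label and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

# Hypothetical scores a trained model might output for one image.
label, confidence = predict_label([2.0, 0.5, -1.0])
print(label, round(confidence, 3))
```

The label with the highest score wins; the probability gives a rough indication of how strongly the model prefers it over the other classes.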

In a simple case, to create a classification algorithm that can identify images with dogs, you’ll train a neural network with thousands of images of dogs, and thousands of images of backgrounds without dogs. The algorithm will learn to extract the features that identify a “dog” object and correctly classify images that contain dogs. While most image recognition algorithms are classifiers, other algorithms can be used to perform more complex activities. For example, a Recurrent Neural Network can be used to automatically write captions describing the content of an image.

Neural network image recognition algorithms rely on the quality of the dataset – the images used to train and test the model (Galiautdinov R., Mkrttchian V., 2019 A).

Once training images are prepared, you’ll need a system that can process them and use them to make a prediction on new, unknown images. That system is an artificial neural network. Neural network algorithms can classify just about anything, from text to images, audio files, and videos. A neural network is an interconnected collection of nodes called neurons or perceptrons. Every neuron takes one piece of the input data, typically one pixel of the image, and applies a simple computation, called an activation function, to generate a result (Galiautdinov R., Mkrttchian V., 2019 B). Each neuron has a numerical weight that affects its result. That result is fed to additional neural layers until, at the end of the process, the neural network generates a prediction for each input or pixel (Figure 1).

Figure 1. Perceptron
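
A single perceptron of the kind shown in Figure 1 can be sketched as a weighted sum followed by an activation function. The step activation, the bias term, and the AND-gate weights below are assumptions chosen for illustration; the article itself does not fix any of them.

```python
def step(z):
    """Step activation: fire (1) when the weighted sum is non-negative."""
    return 1 if z >= 0 else 0

def perceptron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return step(z)

# Hand-picked (not learned) weights that make the neuron act as a logical AND.
AND_WEIGHTS, AND_BIAS = [1.0, 1.0], -1.5

for a in (0, 1):
    for b in (0, 1):
        print(a, b, perceptron([a, b], AND_WEIGHTS, AND_BIAS))
```

Changing the weights or the bias changes which inputs make the neuron fire, which is exactly what training adjusts.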

This process is repeated for a large number of images, and the network learns the weights for each neuron that give the most accurate predictions, through a process called backpropagation. Once a model is trained, it is applied to a new set of images that did not take part in training (a test or validation set) to measure its accuracy. After some tuning, the model can be used to classify real-world images.
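
The train-then-test cycle can be illustrated with a deliberately small sketch: a single sigmoid neuron trained by gradient descent on invented two-feature samples. For one neuron, backpropagation reduces to a single application of the chain rule; a real image model repeats the same idea across many layers. The data, learning rate, and function names are all assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Probability that sample x belongs to class 1."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train(samples, epochs=2000, lr=0.5):
    """Batch gradient descent on the cross-entropy loss."""
    n = len(samples[0][0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for x, y in samples:
            error = predict(weights, bias, x) - y  # gradient of loss w.r.t. the weighted sum
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(samples)
    return weights, bias

def accuracy(weights, bias, samples):
    """Fraction of samples whose thresholded prediction matches the label."""
    hits = sum((predict(weights, bias, x) >= 0.5) == bool(y) for x, y in samples)
    return hits / len(samples)

# Invented data: class 0 clusters near (0, 0), class 1 near (1, 1).
train_set = [([0.0, 0.0], 0), ([0.1, 0.2], 0), ([0.2, 0.1], 0),
             ([0.9, 0.8], 1), ([1.0, 1.0], 1), ([0.8, 0.9], 1)]
test_set = [([0.1, 0.1], 0), ([0.05, 0.1], 0),
            ([0.9, 0.9], 1), ([0.95, 0.9], 1)]  # never seen during training

weights, bias = train(train_set)
print("train accuracy:", accuracy(weights, bias, train_set))
print("test accuracy:", accuracy(weights, bias, test_set))
```

Keeping test_set out of the training loop is what makes the second accuracy figure a fair estimate of how the model behaves on new inputs.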

Traditional neural networks use a fully-connected architecture, as illustrated below, where every neuron in one layer connects to all the neurons in the next layer (Figure 2).
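
A fully-connected layer can be sketched as follows: every output neuron holds one weight per input, so a layer with n inputs and m neurons has n × m weights. The layer sizes, weights, and the ReLU activation are assumptions made for this illustration.

```python
def relu(z):
    """Rectified linear activation: negative sums become zero."""
    return max(0.0, z)

def dense(inputs, weights, biases, activation):
    """One fully-connected layer: each row of weights belongs to one neuron
    and holds a weight for every input value."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

inputs = [1.0, 2.0, 3.0]

# Hidden layer: 2 neurons x 3 inputs = 6 connections (values invented).
hidden_weights = [[0.5, 0.5, 0.0],
                  [-1.0, 0.0, 1.0]]
hidden = dense(inputs, hidden_weights, [0.0, 0.0], relu)

# Output layer: 1 neuron connected to both hidden neurons.
output = dense(hidden, [[1.0, -0.5]], [0.25], relu)

print(hidden)  # [1.5, 2.0]
print(output)  # [0.75]
```

Because the weight count grows with the product of the layer sizes, feeding every pixel of a high-resolution image into such a layer quickly becomes expensive.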
