Real-Time Big Data Processing Based on a Distributed Computing Mechanism in a Single Server

DOI: 10.4018/978-1-6684-7679-6.ch009

Abstract

Today, the problem of processing big data in real time arises not only with unstructured big data but also with structured data in the databases of small businesses and organizations, owing to the rapid increase in data volume. Traditional methods and approaches are not considered effective for solving this problem. Moreover, most modern effective approaches rely on several cooperating computers and are costly, so they are not suitable for small organizations. The approach proposed in this chapter aims to process big data effectively in real time while avoiding these shortcomings. It is based on the use of a distributed computing mechanism on a single server. The chapter presents the architecture of the approach, its functional scheme, its essence, and its effectiveness. It also discusses improving the effectiveness of the approach through machine learning. Experimental results obtained with the approach are compared with those of the traditional approach.

Introduction

Today, many processes in people's daily lives and work are being rapidly digitized, while various social network platforms and blogs are emerging. Furthermore, devices and mechanisms that collect data about themselves and their surroundings are being installed in many buildings and areas. The use of hand-held digital devices, wearable devices, and the Internet is also growing rapidly. As a consequence, a huge volume of varied data flows is being generated. This influx is producing unprecedented amounts of data on large and small computing systems and personal devices, and these data are difficult to process with traditional methods. Although such data create a number of problems related to storage and processing, processing them can be the basis for a complete study of the reasons behind processes that have happened, are happening, and will happen around us. This, in turn, can lead to solutions to problems that remain unsolved in all fields and to those that may be faced in the future. Therefore, research on this large data stream, called Big Data, and its application to real life has become one of the most urgent research topics in today’s world.

It should be noted that Big Data characteristics are now found not only in large international platforms and services such as YouTube, Facebook, email, WhatsApp, Google, Amazon, the App Store, Instagram, and Telegram, or in Internet of Things devices, sensors, and tracking devices that process various signals, but also in the databases of small businesses and organizations. Many small businesses and organizations rely on computers with relatively modest capabilities because of limited budgets and a lack of qualified staff. A faster-than-usual increase in the amount of data stored on these machines leads to a sharp increase in the processing time of their database management systems. Because the existing information systems and the methods used in them are ineffective at processing large amounts of data, real-time processing of these data becomes a problem for those systems.

Researchers around the world are carrying out a large number of studies on storing and processing Big Data for the large social networks mentioned above and for similar information systems, and several effective solutions have been presented as a result. These approaches, however, are not well suited to small organizations and corporations that need to store and process large amounts of data, because most of them are based on distributed computing mechanisms in which a number of servers without a single shared memory work together, which many organizations and enterprises cannot afford. Of course, a distributed computing system in which several servers cooperate is highly efficient at processing Big Data, but it also has the following disadvantages, which demand advanced specialist knowledge and technical capabilities (Tuperekiye & Enebraye, 2015):

  • Storage and infrastructure requirements increase, because storing data in different locations requires multiple copies of the data, which in turn requires a large amount of memory;

  • Increased security threats due to data being stored on multiple servers in multiple locations;

  • Data integrity is difficult to control;

  • Database design is complicated by the characteristics of different servers and their interconnections.

Because of the above drawbacks, and because using several servers increases costs for small organizations and corporations, other effective solutions for real-time data processing need to be found. This shows the relevance of the issue. Usually, such huge volumes of data in small businesses and organizations belong to the Fast Data class of Big Data (Bogomolov & Nevejin, 2017). As Fast Data is one of the most actively used classes of data in the world, improving the efficiency of its real-time processing is a current research topic.
