A Scoping Review of Current Developments in the Field of Machine Learning and Artificial Intelligence

Copyright: © 2023 | Pages: 27
DOI: 10.4018/978-1-6684-8582-8.ch009

Abstract

This chapter gives a broad outline of machine learning and artificial intelligence and introduces the reader to the latest developments in the field. The first half of the compilation provides a comprehensive view of the classical concepts of machine learning, after which examples of machine learning frameworks are discussed. Deep learning concepts, models, types, and algorithms are elaborated in the subsequent section, followed by a detailed introduction to neural networks and the concepts of weights, propagation, and initialization. The final section introduces the reader to the fascinating world of cutting-edge machine learning applications: convolutional neural networks (CNNs), bidirectional long short-term memory (BLSTM), artistic image-generating AI engines such as DALL-E and Stable Diffusion, music- and drama-writing AI engines, the human-like chatbot ChatGPT, art generation with AI, generative and regenerative neural network concepts, and natural language processing (NLP).
Chapter Preview

Big Data

“Big data” refers to large datasets, usually automatically generated, that cannot be processed with traditional data management software such as Microsoft Excel or Access. The term was popularized by John Mashey in the 1990s. Because of the sheer size of the data produced, storage and manipulation are cumbersome: parallel computation and large storage media are required. Specialized software such as Apache Hadoop, the Java-based Hive, Cloudera, MemSQL, Apache Spark, and Amazon S3 can handle the volume of big data. For example, it is estimated that Google processes about 15 exabytes of data per day. It is staggering to note that 15 exabytes equal 15,000,000,000 (15 × 10⁹) gigabytes; for comparison, a standard pen drive holds 64 gigabytes and a large hard disk averages about 1 terabyte, so 15 exabytes of data would fill around 15,000,000 such hard disks. Interestingly, it is estimated that only 0.5% of generated data is used for some form of data analysis (Saxena et al., 2021).
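These magnitudes are simple to verify. The short Python sketch below reproduces the arithmetic, assuming the decimal conversion of 1 exabyte = 10⁹ gigabytes used in the text.

```python
# Back-of-the-envelope check of the storage figures quoted above.
EB_TO_GB = 10**9           # 1 exabyte = 10^9 gigabytes (decimal units)
daily_volume_eb = 15       # estimated exabytes Google processes per day

daily_volume_gb = daily_volume_eb * EB_TO_GB
pen_drives = daily_volume_gb / 64       # standard 64 GB pen drives
hard_disks = daily_volume_gb / 1000     # 1 TB = 1,000 GB hard disks

print(f"{daily_volume_gb:,.0f} GB per day")    # 15,000,000,000 GB
print(f"{pen_drives:,.0f} 64-GB pen drives")   # 234,375,000
print(f"{hard_disks:,.0f} 1-TB hard disks")    # 15,000,000
```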

It is amazing to note that within the span of a minute, 575,000 tweets are posted, 240,000 photos are shared on Facebook, 694,000 hours of video are streamed on YouTube, 5.7 million searches are conducted on Google, $283,000 worth of shopping is done by customers on Amazon, and 856 minutes of webinars are hosted on the Zoom meeting platform. Such a vast quantity of data is processed online every minute (Fontichiaro, 2018).

Distributed file systems (DFS) are used to store big data: a large dataset is divided into chunks, which are then stored across multiple devices or nodes connected by a network. Examples include MongoDB, Apache Cassandra, Elasticsearch, Apache Spark, Apache Hadoop, and Amazon S3 (based on Amazon Web Services).
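The chunk-and-distribute idea can be illustrated with a minimal Python sketch. The 64 MB chunk size, the node names, and the round-robin placement policy below are illustrative assumptions, not the behavior of any particular system.

```python
# Minimal illustration of how a distributed file system splits a file
# into fixed-size chunks and spreads them across nodes (round-robin).
CHUNK_SIZE = 64 * 1024 * 1024          # 64 MB per chunk (assumed)
NODES = ["node-1", "node-2", "node-3"]  # hypothetical storage nodes

def place_chunks(file_size_bytes):
    """Map each chunk index to the node that will store it."""
    n_chunks = -(-file_size_bytes // CHUNK_SIZE)  # ceiling division
    return {i: NODES[i % len(NODES)] for i in range(n_chunks)}

# A 200 MB file becomes 4 chunks spread over the 3 nodes.
print(place_chunks(200 * 1024 * 1024))
# {0: 'node-1', 1: 'node-2', 2: 'node-3', 3: 'node-1'}
```

Real systems add replication on top of this placement, storing each chunk on several nodes so the data survives individual device failures.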

Big data can be curated and vetted structured data, or it can be raw, dirty, unstructured data, and it can take the form of various data types such as numbers, images, text, audio files, and videos. Structured data (curated and annotated numbers, dates, and strings) constitutes only about 20% of big data and can be processed by relational databases using SQL. Unstructured data (images, audio, video, Excel sheets, emails, and Word documents) constitutes the major fraction, about 80%, of big data; it cannot be directly processed by relational databases and needs cleansing, conversion, and pre-processing.
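The practical difference is easy to see in code. Below is a minimal sketch using Python's built-in sqlite3 module; the table name, columns, and sample values are made up for illustration. Structured rows drop straight into a relational table and are immediately queryable with SQL, while raw text has no schema and must first be cleaned and converted.

```python
import sqlite3

# Structured data: typed rows fit directly into a relational table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, sale_date TEXT, amount REAL)")
conn.execute("INSERT INTO sales VALUES (1, '2023-01-15', 199.99)")
total, = conn.execute("SELECT SUM(amount) FROM sales").fetchone()
print(total)  # 199.99 -- answered with a plain SQL query

# Unstructured data: raw text must be pre-processed (cleaned,
# normalized, tokenized) before any analysis can run on it.
raw_email = "  Hi team,\nPlease find the Q1 numbers ATTACHED!!  "
tokens = raw_email.lower().split()  # a trivial cleaning/tokenizing step
print(tokens)
```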
