Long Short-Term Memory-Based Neural Networks in an AI Music Generation Platform

Suresh Kumar Nagarajan, Geetha Narasimhan, Ankit Mishra, Rishabh Kumar
DOI: 10.4018/978-1-6684-6001-6.ch006
Abstract

Music is an essential component of a promotional video because it helps establish the identity of a brand or entity. Composing and producing music, however, is costly: hiring a competent team to create distinctive music for a firm can be prohibitively expensive. Over the last decade, artificial intelligence has accomplished feats that were previously hard to imagine. It can reduce both the money a company would spend on creating its own unique music and the time and effort required on the firm's part. A web-based platform that can be accessed from anywhere in the world would help the product reach customers regardless of geography. AI algorithms can be trained to recognize which combinations of sounds produce a pleasing melody, and several machine learning algorithms can be used to accomplish this.

Introduction

Music theory is the framework that artists and academics use to describe what they hear in a musical work. Its fundamentals help us understand which sounds combine to create pleasing songs. Melody, Harmony, and Rhythm are the most essential terms encountered when learning music theory. The emphasis here, however, is on the rudiments: Scales, Chords, Keys (or Notes), and Notation. Of these, Notes and Chords are our primary focus. Take the piano as an example. A piano keyboard is divided into octaves, and every octave has a set of 7 natural notes: C, D, E, F, G, A and B. Between most pairs of adjacent natural notes there is another note: C#, D#, F#, G# and A#. There is no separate note between E and F or between B and C, so an octave contains 12 distinct notes in total. Every note has a distinct sound. When multiple notes are played together in harmony, the result is called a Chord. Chords can alternatively be defined as individual units of harmony.
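The note and chord vocabulary above can be sketched in a few lines of Python. This is only an illustration: the name `NOTE_NAMES` and the helper `major_triad` are hypothetical, and the major triad (root, plus the notes 4 and 7 semitones above it) is chosen here as one common chord type; the chapter preview does not specify which chords the platform uses.

```python
# The 12 distinct notes of one octave, written with sharp names only.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def major_triad(root):
    """Return the three note names of the major triad built on `root`.

    A major triad stacks the root, the note 4 semitones above it, and
    the note 7 semitones above it. Indices wrap around the octave.
    """
    start = NOTE_NAMES.index(root)
    return [NOTE_NAMES[(start + step) % 12] for step in (0, 4, 7)]


print(major_triad("C"))  # ['C', 'E', 'G']
print(major_triad("A"))  # ['A', 'C#', 'E']
```

Representing notes as positions in a 12-element list makes chords simple index arithmetic, which is also a convenient form for feeding note sequences to a model.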

A combination of chords and/or notes that has a certain degree of Melody, Harmony, and Rhythm is known as music. We aim to extract sequences of chords and notes that sound harmonious. A neural network can provide a machine learning model capable of comprehending music and producing it. Neural networks can be understood through a human example: when shown a drawing of a cat, even if it is just a doodle, we can almost instantly recognize it as a cat. This is because we focus on the "features" of the cat, such as a pair of eyes, four legs, a pair of ears with a typical shape, and so on. We cannot always be certain that an object is what the ML algorithm predicted, so we work with probabilities instead. Every feature of the cat is given a probability of being present. These features are not all equally important, however, so each one is also assigned a relative importance, which we call its weight. We multiply the probability of each feature by its weight and add the products to get a single value. We then apply the sigmoid function to this value to obtain the probability that the object is a cat.
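The cat example amounts to a weighted sum followed by a sigmoid. A minimal sketch, assuming three made-up features with made-up probabilities and weights (none of these numbers, nor the `bias` term, come from the chapter):

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def cat_score(feature_probs, weights, bias=0.0):
    """Weight each feature probability, sum the products, apply the sigmoid."""
    z = sum(p * w for p, w in zip(feature_probs, weights)) + bias
    return sigmoid(z)


# Hypothetical values: probability that eyes, four legs and pointed ears are present.
features = [0.9, 0.8, 0.95]
# Hypothetical weights: the ears count most, the legs least.
weights = [1.5, 0.5, 2.0]

score = cat_score(features, weights, bias=-2.0)
print(f"Probability of 'cat': {score:.2f}")
```

The bias shifts the threshold at which the weighted evidence tips the output above 0.5; it is an addition for illustration and can be left at its default of zero.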

A neuron is a single unit or function that takes inputs, multiplies them by weights, adds the products together, and applies an activation function to the sum. During training, the neuron's weights are adjusted using a large number of input and output samples; in this way, the neuron learns its weights from the inputs and the desired outputs. A neural network is a network of such neurons, in which the output of one neuron may serve as the input of another. As human beings, we do not start thinking from scratch every time. Rather, we use our previous knowledge to understand new information. Traditional neural networks cannot do this; they cannot refer to previous information. Recurrent neural networks (RNNs) address this issue: they contain loops, so information from earlier steps remains accessible at later steps. Sometimes, however, more context is needed. When predicting the last word of a sentence while it is being typed, for example, the word cannot be determined without information from much earlier in the text. In such cases, the gap between the relevant information and the point where it is needed becomes large.
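The loop in a recurrent network can be sketched with a single scalar neuron whose previous output is fed back in as an extra input. This is a toy illustration under assumed values: the function names and the weights `w_x`, `w_h` and `b` are hypothetical and fixed by hand rather than learned.

```python
import math


def rnn_step(x, h_prev, w_x, w_h, b):
    """One recurrent step: combine the new input with the previous state."""
    return math.tanh(w_x * x + w_h * h_prev + b)


def run_rnn(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Feed a sequence through the neuron, carrying the state forward."""
    h = 0.0  # initial state: nothing has been seen yet
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states


# A single non-zero input followed by silence.
states = run_rnn([1.0, 0.0, 0.0])
print(states)
```

The first input still influences the later states even though the later inputs are zero, but its trace shrinks at every step. That fading is the long-gap problem described above.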

A Long Short-Term Memory (LSTM) network is a type of RNN capable of learning across those long gaps (Gers et al., 2000). Like a standard RNN, an LSTM has a chain-like structure, but its repeating module is different: it has four neural network layers interacting in a special way rather than a single one. An LSTM can remember patterns of chords that sound harmonious, which in turn helps generate music. The machine learning algorithm would be integrated with the web platform. The website starts by helping users understand the purpose of the application and then guides them to a page where they select their preferred instrument and certain associated configurations. These configurations are fed into the machine learning model, which generates music using those parameters. The model would already have been trained and deployed on a cloud platform. The generated music is returned to the user, who can then listen to it and decide whether to purchase it.
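The four interacting layers can be sketched as one step of a scalar LSTM cell. This follows the widely used textbook formulation (forget gate, input gate, candidate value, output gate); the chapter preview does not give the equations, so the function name, the parameter dictionary layout and all numeric values below are assumptions for illustration only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell.

    `p` maps hypothetical parameter names to floats: for each layer k in
    {f, i, g, o} there is an input weight "wk", a recurrent weight "uk"
    and a bias "bk".
    """
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate value
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    c = f * c_prev + i * g   # keep part of the old memory, add part of the new
    h = o * math.tanh(c)     # expose a gated view of the memory
    return h, c


# Hand-picked parameters: forget gate nearly open, input gate nearly closed.
keep_params = {name: 0.0 for name in
               ("wf", "uf", "bf", "wi", "ui", "bi",
                "wg", "ug", "bg", "wo", "uo", "bo")}
keep_params["bf"] = 10.0
keep_params["bi"] = -10.0

h, c = 0.0, 1.0  # start with a stored memory value of 1.0
for x in [0.3, -0.7, 0.1]:
    h, c = lstm_step(x, h, c, keep_params)
print(c)  # stays close to 1.0
```

Because the cell state `c` is updated by gated addition rather than being squashed through an activation at every step, a value can be carried across many steps almost unchanged, which is what lets an LSTM bridge long gaps in a note or chord sequence.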