Wayne Cheng

Using Machine Learning and Artificial Neural Networks to Create Music

Updated: Feb 6



The growing interest in artificial intelligence is a result of recent technological advances in machine learning. Many exciting applications are on the horizon, including self-driving cars, automated medical diagnosis, and the creation of new works of art.


Machine learning works by feeding data into a machine, and then using the machine to generate new data. By building a learning machine with artificial neural networks (ANNs), the machine is able to learn from large and complex datasets. ANNs are modeled after the biological neural networks of the brain.


Until recently, machine learning algorithms were only capable of generating small datasets. The invention of Generative Adversarial Networks (GANs) and Recurrent Neural Networks (RNNs) allows machines to generate large datasets. This enables the development of technologies that aid the creative process, such as tools that automate the process of music production.


We limit the scope of the machine to generating a song in the form of music notation. In essence, the machine will provide tools to aid in the process of songwriting.


Design Choices for Automatic Songwriting


The design choices for automatic songwriting follow from the practical applications that the outputs of the machine should have. Such applications include:


  • Generate music based on the public's perception of high-value music

  • Generate music from scratch, or in the context of pre-written material

  • Generate melodies given a set of lyrics, or given a set of chords



Training Dataset


There are two phases for machine learning algorithms: training and evaluation. During the training phase, data is fed into the machine to “train” the machine. During the evaluation phase, the machine is used to generate new data, based on its previous training.


To build a machine that generates high-value music, the training dataset should match the public's perception of high-value music. A song's value can be quantified with the historical data of the public's reception of that song. The data we use is the Billboard Hot 100, which is a weekly chart that ranks songs based on purchase, download, and streaming data.
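
One way to turn chart history into a single number is sketched below. The scoring rule (summing inverted weekly ranks) and the names `chart_score` and `history` are assumptions made for this illustration; the article does not specify how chart data is converted into a value, and the sample ranks are invented.

```python
from typing import Dict, List


def chart_score(weekly_ranks: List[int], chart_size: int = 100) -> int:
    """Sum inverted ranks: rank 1 earns `chart_size` points, the last rank earns 1.

    This rule is a hypothetical example, not the method used in the article.
    """
    for rank in weekly_ranks:
        if not 1 <= rank <= chart_size:
            raise ValueError(f"rank {rank} is outside 1..{chart_size}")
    return sum(chart_size + 1 - rank for rank in weekly_ranks)


# Invented chart histories: one rank per week on the chart.
history: Dict[str, List[int]] = {
    "Song A": [1, 1, 3, 10],
    "Song B": [40, 35, 50],
}
scores = {title: chart_score(ranks) for title, ranks in history.items()}
```

A rule like this rewards both high peaks and long chart runs; other weightings are equally plausible.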


The accuracy of machine learning algorithms is dependent on the accuracy of the training data. For any given song, the most accurate data is contained within the sheet music from the song's publisher. Not all songs on the Billboard Hot 100 have sheet music, but we were able to access over 5,000 songs to generate our dataset.


The sheet music is transformed into a "lead sheet" format, which keeps only the essential elements of a song: the melody, the lyrics, and the chord symbols. By using this format, the dataset is kept within a reasonable size, and exhibits consistency between songs. The expected output of the machine is in the lead sheet format, which can then be transformed into sheet music through orchestration.
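
As a rough illustration of what a lead sheet reduces a song to, the sketch below stores a melody line (with optional lyric syllables) alongside chord symbols. The class names `Note`, `Chord`, and `LeadSheet`, the MIDI pitch numbers, and the beat-based timing are all assumptions made for this example, not a description of the actual dataset encoding.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Note:
    pitch: int        # MIDI pitch number (60 = middle C); assumed encoding
    start: float      # onset, in beats from the start of the song
    duration: float   # length, in beats
    lyric: str = ""   # syllable sung on this note, if any


@dataclass
class Chord:
    symbol: str       # chord symbol such as "C" or "Am7"
    start: float      # onset, in beats
    duration: float   # length, in beats


@dataclass
class LeadSheet:
    title: str
    melody: List[Note] = field(default_factory=list)
    chords: List[Chord] = field(default_factory=list)

    def length_in_beats(self) -> float:
        """Return the beat at which the last note or chord ends."""
        ends = [e.start + e.duration for e in self.melody]
        ends += [e.start + e.duration for e in self.chords]
        return max(ends, default=0.0)


# A two-bar example in 4/4: a C major chord followed by a G major chord.
sheet = LeadSheet(
    title="Example",
    melody=[Note(60, 0.0, 2.0, "la"), Note(64, 2.0, 2.0, "la"), Note(67, 4.0, 4.0, "la")],
    chords=[Chord("C", 0.0, 4.0), Chord("G", 4.0, 4.0)],
)
```

Because every song is reduced to the same few fields, songs of very different arrangements become directly comparable.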


In our research, we run experiments on various ways to transform the data. In order to determine the optimal data transformation, there needs to be a method to evaluate the accuracy of the machine.



Evaluating Accuracy


The accuracy of machine learning algorithms is determined by the difference between the predicted results and the actual results.
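
For a task with discrete labels, a minimal sketch of this measurement is the fraction of predictions that match the actual values. The function name `accuracy` and the sample labels are assumptions for illustration; tasks with numeric outputs, such as predicting a chart rank, would need a different error measure.

```python
from typing import Sequence


def accuracy(predicted: Sequence, actual: Sequence) -> float:
    """Fraction of predictions that exactly match the actual values."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if len(actual) == 0:
        raise ValueError("cannot compute accuracy on an empty sequence")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Invented labels: three of the four predictions match.
predicted = ["song", "random", "song", "song"]
actual = ["song", "random", "random", "song"]
score = accuracy(predicted, actual)
```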


To measure accuracy, the dataset is first split into two sections: training data and test data. In the training phase, the machine is trained with data from the training set. In the evaluation phase, the machine's accuracy is evaluated with data from the test set. Thus, the accuracy is determined by how well the machine makes predictions on new data.
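
The split described above can be sketched as follows. The helper name `train_test_split`, the 80/20 ratio, and the fixed random seed are assumptions chosen for this example; the article does not state which ratio is used.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def train_test_split(
    items: Sequence[T], test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Shuffle the items and hold out a fraction of them as the test set."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    if len(items) < 2:
        raise ValueError("need at least two items to split")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = min(len(shuffled) - 1, max(1, round(len(shuffled) * test_fraction)))
    return shuffled[n_test:], shuffled[:n_test]


songs = [f"song_{i}" for i in range(10)]
train_set, test_set = train_test_split(songs, test_fraction=0.2)
```

Shuffling before splitting keeps the test set from being biased by the order in which songs were collected, and no song appears in both sets.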


In our research, we run experiments to determine what results can be accurately predicted by the machine. For example, can the machine distinguish between a song and a randomly generated sequence of notes? Can the machine predict how well a song will rank on the Billboard Hot 100?
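
To make the first question concrete, the sketch below builds a labeled dataset in which each real song is paired with a random sequence of the same length. The `(pitch, duration)` note encoding, the pitch range, the duration choices, and the function names are assumptions made for this illustration; the two "songs" are invented fragments, and the classifier itself is out of scope here.

```python
import random
from typing import List, Tuple

Note = Tuple[int, float]  # (MIDI pitch, duration in beats); assumed encoding


def random_sequence(length: int, rng: random.Random) -> List[Note]:
    """Draw each pitch and duration independently and uniformly at random."""
    durations = [0.25, 0.5, 1.0, 2.0]
    return [(rng.randint(48, 84), rng.choice(durations)) for _ in range(length)]


def build_labeled_examples(
    songs: List[List[Note]], rng: random.Random
) -> List[Tuple[List[Note], int]]:
    """Pair every real song (label 1) with a random sequence of equal length (label 0)."""
    examples = []
    for song in songs:
        examples.append((song, 1))
        examples.append((random_sequence(len(song), rng), 0))
    rng.shuffle(examples)
    return examples


rng = random.Random(0)
songs = [
    [(60, 1.0), (62, 1.0), (64, 2.0)],
    [(67, 0.5), (65, 0.5), (64, 1.0), (62, 2.0)],
]
examples = build_labeled_examples(songs, rng)
```

Matching the length of each random sequence to a real song prevents the classifier from using length alone as a shortcut.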


We also work backwards from the results, to determine which features of a song have the most impact on accuracy. For example, how much does rhythm or pitch contribute to accuracy? Which sections of a song contribute to accuracy?
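
One common way to work backwards from results is ablation: neutralize one feature at a time and measure how much the accuracy drops. The sketch below uses this technique as an assumption, since the article does not name the method used. The feature names, the `toy_model` stand-in for a trained machine, and the data are all invented for illustration.

```python
from typing import Callable, Dict, List

Example = Dict[str, float]


def accuracy(model: Callable[[Example], int], examples: List[Example], labels: List[int]) -> float:
    """Fraction of examples for which the model's prediction matches the label."""
    correct = sum(model(x) == y for x, y in zip(examples, labels))
    return correct / len(labels)


def ablation_impact(
    model: Callable[[Example], int],
    examples: List[Example],
    labels: List[int],
    features: List[str],
    neutral: float = 0.0,
) -> Dict[str, float]:
    """For each feature, return the accuracy drop when it is replaced by `neutral`."""
    baseline = accuracy(model, examples, labels)
    impact = {}
    for name in features:
        ablated = [{**x, name: neutral} for x in examples]
        impact[name] = baseline - accuracy(model, ablated, labels)
    return impact


def toy_model(x: Example) -> int:
    """Toy stand-in for a trained model: it looks only at the "rhythm" feature."""
    return 1 if x["rhythm"] > 0.5 else 0


examples = [
    {"rhythm": 0.9, "pitch": 0.2},
    {"rhythm": 0.1, "pitch": 0.8},
    {"rhythm": 0.7, "pitch": 0.6},
    {"rhythm": 0.3, "pitch": 0.4},
]
labels = [1, 0, 1, 0]
impact = ablation_impact(toy_model, examples, labels, ["rhythm", "pitch"])
```

In this toy case, removing rhythm costs accuracy while removing pitch costs none, which is the kind of signal that indicates which feature the machine relies on.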





Building Songwriting Applications


Once the accuracy of the machine is sufficient, the machine can then be used to generate new music. The accuracy of the machine determines the quality of the results. For example, if a machine is 90% accurate on the test set, then we can expect it to generate the expected results roughly 90% of the time on similar data.


In our research, we try to build applications that are helpful for songwriters. The songwriter can customize the parameters of the machine to generate music that meets expectations. Depending on the parameters, the application chooses the appropriate neural network model and training data to create new music.
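
The selection step can be pictured as a simple dispatch on what the songwriter supplies. In the sketch below, the `SongwriterParameters` fields, the `choose_model` function, and the model names it returns are hypothetical, chosen to mirror the applications listed earlier; they are not the actual application's interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SongwriterParameters:
    lyrics: Optional[str] = None             # lyrics to set to a melody
    chords: Optional[str] = None             # chord progression, e.g. "C G Am F"
    existing_material: Optional[str] = None  # pre-written music to continue


def choose_model(params: SongwriterParameters) -> str:
    """Return the name of a (hypothetical) model suited to the given parameters."""
    if params.lyrics:
        return "melody_from_lyrics"
    if params.chords:
        return "melody_from_chords"
    if params.existing_material:
        return "continue_existing_material"
    return "generate_from_scratch"


model_name = choose_model(SongwriterParameters(chords="C G Am F"))
```

Keeping the choice in one function makes it easy to add a new model later without changing how the songwriter describes a request.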



Thank you for reading. Questions or comments? You can reach me at info@audoir.com



Wayne Cheng is an A.I., machine learning, and deep learning developer at Audoir, LLC. His research involves the use of artificial neural networks to create music. Prior to starting Audoir, LLC, he worked as an engineer in various Silicon Valley startups. He has an M.S.E.E. degree from UC Davis, and a Music Technology degree from Foothill College.

Copyright © 2020 Audoir, LLC

All rights reserved