Music AI, a brief history
AI-generated music is a field that has evolved enormously, especially in the last few years. In this post we’ll look at where it all started, how it developed from there, and what you can expect from musical AI right now.
It could be argued that the first procedurally generated piece of music was created in the early 1950s, when composer John Cage used the ancient Chinese divination text I Ching to generate instruction sets for music. The first proper computer-composed piece, however, was the Illiac Suite in 1956. Lejaren Hiller and Leonard Isaacson programmed the ILLIAC I, one of the first computers to be built and owned entirely by a university, to create a score that was later performed by human musicians. Hiller defined a set of rules, for example restricting notes to a single octave and favoring consonant major and minor harmonies over dissonance, and the computer then generated random material, keeping only what satisfied those rules.
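To make that concrete, here is a minimal Python sketch of the generate-and-test idea: draw random notes and keep only those that pass a small set of stylistic rules. The rules below are illustrative stand-ins, not Hiller’s actual rule set.

```python
import random

C_MAJOR = [60, 62, 64, 65, 67, 69, 71, 72]  # MIDI notes, one octave of C major

def allowed(note, melody):
    """Accept a note only if it stays within an octave of the opening
    note and never leaps more than a fifth from the previous note."""
    if not melody:
        return True
    if abs(note - melody[0]) > 12:   # stay within an octave of the start
        return False
    if abs(note - melody[-1]) > 7:   # no leaps larger than a fifth
        return False
    return True

def generate_melody(length=16):
    melody = []
    while len(melody) < length:
        candidate = random.choice(C_MAJOR)
        if allowed(candidate, melody):  # reject notes that break a rule
            melody.append(candidate)
    return melody

print(generate_melody())
```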
The 1980s saw a number of fundamental breakthroughs in the field. Among others, the composer David Cope developed a general approach, which he called Experiments in Musical Intelligence (EMI), built around three main steps: deconstructing and analyzing input music, looking for the commonalities and signatures that signify a style, and recombining musical elements into new works. This work became the foundation for many later AI models.
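As a toy illustration of the recombination idea (this is the general principle, not Cope’s actual EMI algorithm), one can chop input melodies into short fragments and then chain together fragments whose seams line up:

```python
import random
from collections import defaultdict

def fragment(melody, size=4):
    """Chop a melody into consecutive fragments of `size` notes."""
    return [melody[i:i + size] for i in range(0, len(melody) - size + 1, size)]

def build_index(melodies, size=4):
    """Index every fragment by its opening note."""
    index = defaultdict(list)
    for melody in melodies:
        for frag in fragment(melody, size):
            index[frag[0]].append(frag)
    return index

def recombine(index, start_note, n_fragments=4):
    """Chain fragments so each one begins where the last one ended."""
    piece, note = [], start_note
    for _ in range(n_fragments):
        candidates = index.get(note)
        if not candidates:       # no fragment starts on this note: stop
            break
        frag = random.choice(candidates)
        piece.extend(frag)
        note = frag[-1]          # next fragment must begin on this note
    return piece

corpus = [[60, 62, 64, 65, 67, 65, 64, 62, 60, 64, 67, 72],
          [67, 65, 64, 62, 60, 62, 64, 67, 65, 64, 62, 60]]
print(recombine(build_index(corpus), start_note=60))
```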
In 2002, a paper by François Pachet introduced a program called The Continuator: a machine learning model that listens to a musician play and then continues playing in the original style. It was one of the first programs to use machine learning in real time to generate original music, and it marked a significant milestone in the field.
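Pachet’s system was built around variable-order Markov models learned from what the musician just played. The first-order sketch below is a heavy simplification of that idea; the note values and function names are made up for illustration.

```python
import random
from collections import defaultdict

def train(notes):
    """Count which note tends to follow which."""
    transitions = defaultdict(list)
    for current, nxt in zip(notes, notes[1:]):
        transitions[current].append(nxt)
    return transitions

def continue_phrase(notes, transitions, length=8):
    """Extend a phrase by sampling from the learned transitions."""
    phrase = list(notes)
    for _ in range(length):
        options = transitions.get(phrase[-1])
        if not options:  # unseen note: fall back to any observed note
            options = [n for opts in transitions.values() for n in opts]
        phrase.append(random.choice(options))
    return phrase

heard = [60, 62, 64, 62, 60, 64, 67, 64]  # notes "heard" from the musician
model = train(heard)
print(continue_phrase(heard[-2:], model))
```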
Between 2010 and now, development has picked up speed and several advances have been made. Iamus, a computer cluster developed at the University of Málaga, composes classical music in its own style. It is one of the first systems to combine evolutionary algorithms with supervised learning to generate music.
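As a rough sketch of the evolutionary half of that combination (a generic genetic algorithm, not Iamus’s actual Melomics system), one can evolve a population of melodies against a hand-written fitness function:

```python
import random

SCALE = [60, 62, 64, 65, 67, 69, 71, 72]

def random_melody(length=8):
    return [random.choice(SCALE) for _ in range(length)]

def fitness(melody):
    # Reward small steps between notes, penalize repeating the same note.
    steps = [abs(a - b) for a, b in zip(melody, melody[1:])]
    return -sum(steps) - 3 * steps.count(0)

def mutate(melody, rate=0.2):
    return [random.choice(SCALE) if random.random() < rate else n
            for n in melody]

population = [random_melody() for _ in range(50)]
for generation in range(100):
    population.sort(key=fitness, reverse=True)
    survivors = population[:10]                    # keep the fittest
    population = survivors + [mutate(random.choice(survivors))
                              for _ in range(40)]  # refill by mutation

print(max(population, key=fitness))
```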
OpenAI’s Jukebox, released in 2020, is a more recent model that uses neural networks to generate raw audio. This is significant because most earlier models generated only MIDI data or symbolic scores.
Looking at the current state of the art, we find many different programs filling numerous roles within the field of musical AI. One of the most notable is Google’s MusicLM, a machine learning model that generates music from text prompts. It can parse prompts describing the instruments used, the purpose of the music, the mood, even the experience level of the player, and produce music that matches the description with impressive accuracy.