Is artificial intelligence altering contemporary music?

Artificial intelligence
Artificial intelligence not only shapes our everyday lives, but it is also increasingly becoming a part of music. Learning machines and algorithms are now sometimes equal partners in compositions.

Nothing about a caterpillar betrays that it will metamorphose into a butterfly – as early as the ’60s, Richard Buckminster Fuller knew about the equally profound and comprehensive effect of seemingly marginal changes. Inconspicuously, and largely shrugged off by the global public, artificial intelligence has been making enormous progress for some years now, and we humans are invariably the catalyst, or more precisely: our interactions.

The algorithms from Alphabet and Apple, Twitter, TikTok, Meta and Microsoft are fed by our search queries and by information about product use, conversations with Alexa or Siri, software backdoors, legal and illegal metadata collection and computer-linguistic methods for evaluating all of this. Language is always essential for approaching the singularity, i.e. the moment that heralds the reproduction of human intelligence by machines, in an AI business that is now undergirded by the military.

What is more, the US Department of Energy (DOE), the Pentagon and the transhumanist microcosm of Silicon Valley are spending hundreds of billions of dollars in black budget programmes for research into neural networks, learning autonomous systems and robotics – goal: unknown.

The cultural technique of manipulated sounds, commonly referred to as music, does not remain unaffected by these developments. Artificial intelligence (AI) ensures that people are first recommended new artists on Spotify, then the same style grids and subgenres, until they finally regurgitate the algorithms’ fodder in their own taste bubble.

It is now also part of musical creation itself – whether in the form of learning software for vocal genesis, dynamic recording and modulation technology or automated black boxes for creating individual patches, entire songs, collage-like cover artwork or machine melodies. Not only the way we receive music, but increasingly also how we create and distribute it is guided by processes that more or less originate from artificial intelligence or learning systems – and not just since yesterday.

Machines make music themselves

As early as 1960, the information technologist Rudolf Zaripov published the first scientific paper on the subject, titled »On the algorithmic description of the process of music composition«, in the Soviet journal »Доклады Академии Наук СССР« (Doklady Akademii Nauk SSSR, DAN SSSR). He used a first-generation »Ural-1« vacuum tube computer, and later its successors, the »Ural-2« to »Ural-4«. Although the BBC had already made shaky recordings of a Ferranti Mark 1 computer playing the traditional nursery rhyme »Baa Baa Black Sheep« and Glenn Miller’s »In The Mood« in the early 1950s, these were examples of knowledge-based synthesis, not true autonomous generative music as we are only beginning to understand it today.

The definition of »AI music« or »Algosound« is still as controversial today as the terms themselves. The early and without doubt historically significant sounds made by an IBM mainframe computer (Bell Labs, 1957) or the Australian CSIRAC computer (Pearcey, Beard, Hill, 1950) do not yet count as such. It is possible, however, that Ray Kurzweil’s appearance on »I’ve Got a Secret« in 1965 can be regarded as at least the first public moment in which a learning algorithmic system with pattern recognition composed its own melodies.

These were early attempts at a completely new kind of music that integrated machines as acting subjects and that continued to develop in parallel to the progress made by semiconductor technology. Nevertheless, for almost three decades, approaches to generative music, no matter how advanced, tended to be reserved for the electro-acoustic avant-garde and those tech geeks who could surmount the programming hurdles and handle the state of the art.

From Karlheinz Stockhausen to Pierre Boulez, from Roger B. Dannenberg to David Cope, from acousmatist Roland Kayn to ambientologist Brian Eno, it was scientists and eccentrics at the edges of the audible who saw machines less and less as tools and more and more as actors gradually emancipating themselves from their creators. In 1997, the »Experiments In Musical Intelligence« (EMI, or Emmy) programme succeeded in convincing an entire audience that its composition was an original by Bach, and that it sounded more like Bach than the piece written by its human counterpart, Dr Steve Larson.

In 2012, on the album »Iamus«, the London Symphony Orchestra went so far as to interpret music composed entirely by a computer cluster called Melomics. When he experimented with the software Koan from 1995 onwards, Brian Eno himself came up with a term for this acoustic accommodation of the human-machine interface in various interaction situations, one that is still valid today but is slowly crumbling: he called it »Generative Music«.

Data instead of notes

With the turn of the millennium, the Internet took off – search engines and platform capitalism, algorithms for pattern recognition in all conceivable data clusters, the architecture of microprocessors and the fibre optic infrastructure, but also the macro processes of the social superstructure, all began to turn somersaults. Artificial intelligence, or what was and still is called that in various quarters, has been changing faster than ever since.

Producers of the most diverse genres are increasingly recognising the potential of generative music and are beginning to combine different methods of synthesis: stochastic models, knowledge-based systems fed with gigantic amounts of data, computer-linguistic grammars, evo-devo approaches and machine learning (also: deep learning), in which artificial neural networks develop a (still!) human-dependent dynamic in the composition process and acoustic reproduction – everything that is technologically feasible comes into use, both in isolation and in combination.

And what is feasible can sometimes be as exciting as it is disturbing. At one end of the spectrum are developments such as the Travis Bott software, which works with thousands of MIDI files extracted from Travis Scott’s melodies and beat sequences, as well as his bars and lyrics, and condenses them in the track »Jack Park Canny Dope Man« into a kind of artistic composite of Scott, with amazingly successful results.

Scott’s music, with its diatonic structure and simple sound sequences, is certainly predestined for AI imitations that are already very similar to the original, but this is only one of many beginnings in the technology’s development.

Where does the aesthetic potential lie?

In Japan, as is often the case, they were already a few steps ahead. With Miku Hatsune (»the first sound of the future«), Crypton Future Media released an artificial singing voice in 2007, based on the software synthesiser Vocaloid2, which rose to become a virtual pop icon within a few years and is still marketed today just like its human counterparts in the highly commercialised Japanese pop business. A particularly palatable difference for Japan’s major labels: in less than six years, Hatsune sang over 100,000 songs – more than all other Japanese artists put together. A wet dream for all performing rights societies because her songs are well received. She regularly reaches the Japanese Oricon charts, wins awards or is even booked for Western TV formats (»Late Show with David Letterman«) and major festivals (Coachella).

At the other end, and musically completely different in nature but similarly groundbreaking, are systems such as the deep learning algorithm AIVA, created in 2016. AIVA (Artificial Intelligence Virtual Artist) analyses works by Bach, Beethoven, Mozart and others and derives its own compositions from them. To date, AIVA is the first software to be recognised as a composer in its own right by a performing rights society, SACEM in France.

Do these kinds of systems made up of databases, code and PR departments threaten to outstrip human pop stars in the future? Some think that is already the case. Canadian pop futurist Grimes, speaking on the Mindscape podcast in 2019, predicted the future extinction of human artists: »Sooner or later, AI is going to emulate all of our hormones, our feelings, our emotions, and understand what great art and true innovation for us is. Probably even better than we do.«

That Elon Musk’s former partner anticipates such a dramatic and technophilic future is hardly surprising. Artists like Holly Herndon see another development as more dominant, one in which artificial intelligences, or learning systems, are more like new tools or perhaps even new band members and session musicians. In 2019, she and her husband Mat Dryhurst released the album »Proto«, which featured SPAWN, a neural network they both designed. SPAWN was fed for weeks and months with a multitude of voices of varying ranges and timbres, trained by soloists as well as choirs, and reproduced from them a voice of its own, or rather a repertoire of vocal ranges that can underpin as well as expand, accentuate and also contrast Herndon’s singing.

»The really interesting question is: what aspect of this technology can show us new aesthetics, new potentials, from which we can in turn learn? Instead of simply scraping together the history of human music and creating a generic intersection – that would be a rather boring approach to working with a neural network,« she said at the time.

AI is what it eats

Herndon therefore focuses on a voice synthesis that is as human as possible, creating its own voice from heard and learned voices. In recent years, however, algorithms have also been trained more specifically to recognise and reproduce beat patterns, chord progressions, sampling, sequencing and timbres. In 2018, Sean Booth and Rob Brown aka Autechre equipped their Max/MSP algorithm with numerous parameters, let it jam and recognise patterns for days and then bolted together the eight hours of material from the »NTS Sessions« live in exchange with the machine – possibly one of their most ambitious projects to date.

Four years later, the results are nothing less than breathtaking and demonstrate vividly how powerful the tool of an intelligent, learning system can be in the right hands. The British sound sculptor Darren J. Cunningham aka Actress has also been working with his own algorithm since 2018, which he christened »Young Paint« and let dive into strangely futuristic waters of outsider house, dub techno and generative music on his eponymous debut EP.

Young Paint has already collaborated with LA-based artist K Á R Y Y N on her elegiac »Quanta« series and Cunningham has been constantly developing it ever since. Meanwhile, Brits James Ginzburg and Paul Purgas released the album »Blossoms« (2019) under the name Emptyset. Also made in cooperation with a software program, it shows how ice-cold and disturbing this entanglement of technological hybrids and algorithmic mutations can turn out to be when the respective system is fed with the appropriate material – AI is what it eats.

The »A Late Anthology Of Early Music Vol. 1: Ancient To Renaissance« by the Irish improvisation artist Jennifer Walshe, released in 2020, sounds similarly disturbing. In cooperation with the US-American duo Dadabots, she applied a neural network to the linguistic training of her a cappella recordings – over 40 (simulated) generations or 1200 years. Does this mean that artificial intelligence will also transform the state, the tonal starting point of classical music in the years to come?

Even outside the Western musical canon, artists are setting out to seek new forms of expression with the tools provided by learning systems. Just last year, for example, the Ugandan trio Metal Preyers released the mix-tape »432+«, an unconventional take on generative, post-industrial experimentation even in this field, drawing on ambient dub, field recordings, tribal electro-acoustics, R&B and noise. Although learning systems were only used to a very limited degree here, that in itself is apparently enough to create completely different soundscapes that have that special, robotic touch to them. Where »AI music« begins and where it ends, however, remains difficult to define.

One thing is certain, however: the developments are in their infancy and hardly any of the approaches currently being pursued by artists worldwide actually integrate systems that we can consider »intelligent« in a classical human sense – mainly because they simply do not exist yet. But maybe criticism of our very limited definition of »intelligence« is also appropriate in this context, in order to understand at the end of the beginning what we actually mean by it – and what is in store for us.
