Now AI is coming for musicians

Do you remember those far off misty days of yore, when shocking, startling, amazing, disquieting revelations from the world of Artificial Intelligence only arrived every year or two, or even longer? It was about, ooh, a fortnight ago: a wistful, innocent time of smiling boy scouts, and honey for tea, and vicars in bicycle clips, and all we had to worry about was this funny new thing called GPT3.

Since then, things have, to say the least, accelerated. We’ve had ChatGPT, and GPT4, and Google’s Bard, and Google’s risibly woke Gemini, and France’s Mistral, and Google DeepMind solving profound scientific mysteries, and weird robot dogs, and quasi-autonomous weapon systems, and androids that can do dishes, and deepfake images of Donald Trump mud wrestling Joe Biden, and a New York Times quiz which, perplexingly, showed that not only can people no longer distinguish between fake AI photos of faces and photos of real faces, people seem to actively prefer the fake faces, they find AI’s fake reality more real than reality, a discovery which has such disturbing implications I am going to swiftly move on and try not to think about it.

Alongside these epochal developments, anyone in the arts will likewise have noticed that AI is coming for everyone in the arts. Photographers have been unnerved and depressed by the superb image-creation skills of models like Dall-E and Stable Diffusion, and, especially, Midjourney (which made some of those images for the NYT). Just last month OpenAI, the company at the bleeding edge of this revolutionary tech, announced Sora, a ‘diffusion’ model which creates video scenes (of hallucinatory brilliance and sometimes great beauty) out of verbal prompts. This has freaked out everyone in TV, advertising and Hollywood. And for good reason: why pay for actors, sets, designers, locations, and a star who goes on coke binges, when you can sit at home and press a button marked ‘make a movie’?

Nor is it just the visual arts that are imperilled. AI is also coming for us wordsmiths, and in short order. A year ago I wrote in the Spec that ‘AI is the end of writing’, and at the time the commentary on my piece was sceptical if not derisory; now the scepticism and derision are draining away. Verbal AI is generating academic texts, essays, fan fiction, romantic stories and advertising copy.

Soon it will move on to novels, biography, drama. Industry insiders estimate AI will be able to generate plausible soap opera scripts in three years, and AI won’t go on strike, unlike the Writers’ Guild of America. There is, in the end, no reason why AI won’t write ‘great’ literature: humans do not have a special creative sauce unique to Homo sapiens, unless you believe God came down to one planet and chose one bipedal ape and said ‘let him make sonnets’.

All of the arts, then, are under siege. And one of the few artistic realms left unmolested to date seems to be music: perhaps the noblest of the arts, certainly the most indefinable and mysterious, the most preciously human. As Shakespeare said, ‘If music be the food of love, play on’. Trouble is, it now looks like it will be the machines playing on, not humanity.

Let me introduce you to Suno, a music-making AI model. It is far from perfect, in many ways it is primitive, clumsy, poor – but then so was image-making AI about 18 months ago. There is no obvious reason why in three or six years Suno will not be making superb music.

Right now, however, Suno is best seen as an illuminating toy (like GPT3 in its day). It works like this: you can type in your lyrics and then request a musical style – opera, neo-folk, country, death metal, rock anthem. Then you give it a title and press ‘create’ and it creates your song.

As I am writing in The Spectator, to source my lyrics I went straight to the top, the boss, the editor, and a searing piece Fraser Nelson wrote about Lee Anderson, the Tory MP who this week switched to Reform. I snipped out a passage to make the lyrics, slightly compressed them, called the song ‘Licensed Rottweiler’ and asked the computer to compose it in a bluegrass style (because I like bluegrass). It did an OK job, listen to it here.

To display the versatility of the model I then asked Suno to make a tune of ‘Licensed Rottweiler’ in a different, ‘Vivaldi-ish’ operatic style, with hints of aria. Here it is.

You can, of course, hear the problems, flaws and glitches. Words are garbled, or missed entirely, weird pauses intrude, and the songs often don’t know when to end. And yet, this technology is also properly amusing. As a final test I went to Twitter and found a particularly mad tweet by a pro-Palestinian activist, who appeared to be lamenting the loss of the good old days when you could freely hijack planes and fly them into towers. This is the tweet:

You can’t protest peacefully. You can’t boycott. You can’t hunger strike. You can’t hijack planes. You can’t block traffic. You can’t throw Molotovs. You can’t self-immolate. You can’t heckle politicians. You can’t march. You can’t riot. You can’t dissent. You just can’t be.

This tweet seemed made for music, so I set to work. For Suno I slightly adapted the words (I don’t think the AI liked the overt hints of violence), then I asked for an epic rock ballad, called ‘You Can’t Even Hijack Planes’, and it delivered. Spookily, it absolutely nailed the chorus. See what you think.

For about an hour after making that ditty I had that chorus ‘You can’t even hijack planes’ spinning in my mind. I also wanted to hear more of the song. If Suno could only produce a proper finished version, I’d probably download it.

So that’s where we are with AI and music. This AI music is, as I say, about two years behind the AI image making and essay generating, but they are all on the exponential curve pointing up. So if you fancy being a human musician in the future, make sure you are really good at live performances. Because that is, probably, all that will be left.