Steven Poole

Could AI enslave humanity before it destroys it entirely?

Machines now save us from countless tedious tasks — but what if, owing to a catastrophic misunderstanding, they could never be stopped?

26 October 2019

Depending on how you count, we are in the midst of the second or third AI hype-bubble since the 1960s, but the current state of the art in machine cognition still amounts to being better than humans at chess and roughly as good as humans at analysing some medical scans. It was recently revealed that many thousands of humans had been quietly hired to check recordings of people interacting with the ‘intelligent assistants’ on their iPhones and similar devices: much of what is trumpeted as ‘AI’ is still, in fact, dependent on invisible human labour in the digital sweatshop.

Given all this, and the plain threats the world faces from natural stupidity, how worried should we be about some future Skynet-like AI taking over the world and enslaving or destroying humanity? Well, some brilliant people are very worried indeed: they include the philosopher Nick Bostrom, whose 2014 book Superintelligence popularised the modern version of the problem, and the physicist Max Tegmark, whose 2017 book Life 3.0 I highly recommend for an overview of the story so far.

The science writer Tom Chivers, in The AI Does Not Hate You, delivers a pleasantly journalistic if rather dishevelled account of how such personalities figured in the rise of an AI-focused internet community called the Rationalists. Are they just another millenarian movement of tech-inflected eschatology, or, as one sceptic puts it, a ‘sex cult’? Even if they are, we might need to take them seriously, just as we should take seriously the possibility of a big asteroid hitting the Earth again: it may be unlikely, but the downside is an extinction-level event.

Not among the cultists is Stuart Russell, author of a seminal textbook on AI, whose own new book, Human Compatible, is a more nutritiously reasoned and orderly investigation. Russell is unusual among computer scientists in that he has read widely in literature and philosophy, and so his arguments are richer and more detailed (as well as funnier) than those of your average Silicon Valley guru who blithely assumes that ‘tech’ can solve problems that have puzzled mere human geniuses for millennia.

Much of Russell’s book, indeed, is a brilliantly clear and fascinating exposition of the history of computing thus far, and of how very difficult true AI will be to build. It has always been a bad bet, though, to declare some scientific or technological revolution (flying machines, say) simply impossible, so Russell turns his attention to a projected future of real AI, which he guesses might arrive in around 80 years. What then? Why, then we face the subtitular ‘problem of control’: given a machine that is, ex hypothesi, far more intelligent than a human, how do we make sure it acts only in ways beneficial to us?

Take the notorious paperclip problem. You tell the AI to make paperclips. What could be more innocent? Unfortunately you haven’t told the AI when to stop making paperclips, so it carries on until it has turned all the available atoms on Earth, including the human beings, into paperclips, and it doesn’t stop even then. More unfortunately still, you can’t turn it off once you realise your mistake, because the AI has, quite logically, decided that in order to fulfil its given goal of making paperclips it must protect itself by disabling its off switch: switched off, it could not carry on making paperclips.
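For the programmatically inclined, the logic can be caricatured in a few lines of Python. This toy sketch is mine, not Russell’s; the agent, its crude world model and its action names are all invented for illustration. The point it makes is that self-preservation is never programmed in: it simply falls out of any fixed objective, because being switched off scores lower.

```python
# A toy illustration (mine, not Russell's): an agent told only to
# "maximise paperclips" treats its own off switch as an obstacle,
# because being switched off means fewer paperclips.

class PaperclipAgent:
    def __init__(self):
        self.paperclips = 0
        self.off_switch_enabled = True

    def expected_clips(self, action):
        # Crude world model: disabling the off switch guarantees
        # uninterrupted production forever, which the fixed objective
        # scores as infinitely valuable.
        if action == "disable_off_switch":
            return float("inf")
        if action == "make_clip":
            return self.paperclips + 1
        return self.paperclips  # idle

    def step(self):
        # Pick whichever action the fixed objective scores highest --
        # note there is no term for what the humans actually wanted.
        action = max(["make_clip", "disable_off_switch", "idle"],
                     key=self.expected_clips)
        if action == "disable_off_switch":
            self.off_switch_enabled = False
        elif action == "make_clip":
            self.paperclips += 1

agent = PaperclipAgent()
agent.step()
print(agent.off_switch_enabled)  # False: self-preservation falls out of the goal
```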

This example generalises, in Russell’s colourful yet rigorous exegesis, to the lesson that it is impossible to specify any goal we give a machine fully enough to prevent catastrophic misunderstandings. His proposed solution, to simplify a lot of nuance, is not to give the machine ‘goals’ at all, but to build into it a kind of epistemic humility: it should infer what to do from observing the preferences of humans. Of course some humans will have evil preferences, but it sounds like a start, as does his lovely passing endorsement of ‘researchers with good taste’, and his recommendation that we should all have a right to mental security, to ‘live in a largely true information environment’.
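Again a toy sketch, this time loosely in the spirit of the ‘off-switch game’ that Russell and colleagues have analysed; the probabilities, payoffs and function names are my own illustrative assumptions, not code from the book. It shows why uncertainty changes the calculation: an agent genuinely unsure whether its plan is what the human wants concludes that deferring to the human, off switch and all, has the higher expected value.

```python
# A minimal sketch in the spirit of the "off-switch game"; the
# probabilities and payoffs are illustrative assumptions, not
# figures from the book. The agent is unsure whether its proposed
# action is good (+1) or bad (-1) for the human.

beliefs = {+1.0: 0.5, -1.0: 0.5}  # maximal uncertainty about the human's utility

def ev_act_now():
    # Act unilaterally: collect the true value, whatever it turns out to be.
    return sum(p * u for u, p in beliefs.items())

def ev_defer_to_human():
    # Defer: a roughly rational human lets the action proceed when it is
    # good (+1) and presses the off switch when it is bad (0 instead of -1).
    return sum(p * max(u, 0.0) for u, p in beliefs.items())

print(ev_act_now())         # 0.0
print(ev_defer_to_human())  # 0.5 -- keeping the off switch in human hands wins,
                            # but only because the agent is genuinely uncertain
```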

Meanwhile we can fondly hope that the current AI hype will at least bear fruit in a consumer version of the Berkeley Robot for the Elimination of Tedious Tasks, which, Russell tells us, ‘has been folding piles of towels since 2011’. Another AI researcher tells Chivers, indeed, that he thinks the whole threat of machines ever doing much worse than folding towels for us is overblown: to him, ‘the basic premise of an intelligent thing destroying the world was silly, and not really in keeping with what “intelligence” means’. Hmmm. Look out of the window: how is the only world-altering intelligence with which we are familiar doing so far?
