
How to kill a rogue AI



It’s advice as old as tech support. If your computer is doing something you don’t like, try turning it off and then on again. When it comes to the growing concerns that a highly advanced artificial intelligence system could go so catastrophically rogue that it could cause a risk to society, or even humanity, it’s tempting to fall back on this sort of thinking. An AI is just a computer system designed by people. If it starts malfunctioning, can’t we just turn it off?

In the worst-case scenarios, probably not. This is not only because a highly advanced AI system could have a self-preservation instinct and resort to desperate measures to save itself. (Versions of Anthropic’s large language model Claude resorted to “blackmail” to preserve itself during pre-release testing.) It’s also because the rogue AI might be too widely distributed to turn off. Current models like Claude and ChatGPT already run across multiple data centers, not on one computer in one location. If a hypothetical rogue AI wanted to prevent itself from being shut down, it would quickly copy itself across the servers it has access to, preventing hapless and slow-moving humans from pulling the plug.

A new analysis from the Rand Corporation discusses three potential courses of action for responding to a “catastrophic loss of control” incident involving a rogue artificial intelligence agent. The three potential responses — designing a “hunter-killer” AI to destroy the rogue, shutting down parts of the global internet, or using a nuclear-initiated EMP attack to wipe out electronics — all have a mixed chance of success and carry significant risk of collateral damage. The takeaway of the study is that we are woefully unprepared for the worst-case-scenario AI risks, and that more planning and coordination are needed.

Killing a rogue AI, in other words, might require killing the internet, or large parts of it. And that’s no small challenge.

This is the challenge that concerns Michael Vermeer, a senior scientist at the Rand Corporation, the California-based think tank once known for pioneering work on nuclear war strategy. Vermeer’s recent research has focused on the potential catastrophic risks from hyperintelligent AI, and he told Vox that when these scenarios are considered, “people throw out these wild options as viable possibilities” for how humans could respond, without considering how effective they would be or whether they would create as many problems as they solve. “Could we actually do that?” he wondered.

In a recent paper, Vermeer considered three of the options experts most frequently suggest for responding to what he calls a “catastrophic loss-of-control AI incident.” He describes this as a rogue AI that has locked humans out of key security systems and created a situation “so threatening to government continuity and human wellbeing that the threat would necessitate extreme actions that might cause significant collateral damage.” Think of it as the digital equivalent of the Russians letting Moscow burn to defeat Napoleon’s invasion. In some of the more extreme scenarios Vermeer and his colleagues have imagined, it might be worth destroying a good chunk of the digital world to kill the rogue systems within it.

In (arguably) ascending order of potential collateral damage, these scenarios include deploying another specialized AI to counter the rogue AI; “shutting down” large portions of the internet; and detonating a nuclear bomb in space to create an electromagnetic pulse.

One doesn’t come away from the paper feeling particularly good about any of these options.

Option 1: Use an AI to kill the AI

Vermeer imagines creating “digital vermin,” self-modifying digital organisms that would colonize networks and compete with the rogue AI for computing resources. Another possibility is a so-called hunter-killer AI designed to disrupt and destroy the enemy program.

The obvious downside is that the new killer AI, if it’s advanced enough to have any hope of accomplishing its mission, might itself go rogue. Or the original rogue AI could exploit it for its own purposes. At the point where we’re actually considering options like this, we might be past the point of caring, but the potential for unintended consequences is high.

Humans don’t have a great track record of introducing one pest to wipe out another one. Think of the cane toads introduced to Australia in the 1930s that never actually did much to wipe out the beetles they were supposed to eat, but killed a lot of other species and continue to wreak environmental havoc to this day.

Still, the advantage of this strategy over the others is that it doesn’t require destroying actual human infrastructure.

Option 2: Shut down the internet

Vermeer’s paper considers several options for shutting down large sections of the global internet to keep the AI from spreading. This could involve tampering with some of the basic systems that allow the internet to function. One of these is the “Border Gateway Protocol,” or BGP, the mechanism that lets the many autonomous networks that make up the internet share routing information with one another. A BGP error caused a massive Facebook outage in 2021. BGP could in theory be exploited to prevent networks from talking to each other and shut down swathes of the global internet, though the decentralized nature of the network would make this tricky and time-consuming to carry out.
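For a rough intuition of why severing those routing relationships fragments the network, here is a toy Python sketch. It is not real BGP, which involves route announcements, policies, and path selection; the autonomous-system names and links below are invented purely for illustration. The point is simply that two networks can exchange traffic only if an unbroken chain of routing relationships connects them, so withdrawing a few key routes can split the graph in two.

```python
# Toy model (not real BGP) of how withdrawing routes between autonomous
# systems can partition a network: two networks can only reach each other
# if a chain of routing relationships connects them.
from collections import deque

# Hypothetical autonomous systems and the links over which they exchange routes.
links = {
    "AS-A": {"AS-B", "AS-C"},
    "AS-B": {"AS-A", "AS-D"},
    "AS-C": {"AS-A"},
    "AS-D": {"AS-B"},
}

def reachable(src: str, dst: str, links: dict[str, set[str]]) -> bool:
    # Breadth-first search over the link graph: dst is reachable only if
    # routing information can propagate from src to dst hop by hop.
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for neighbor in links.get(node, set()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False

print(reachable("AS-C", "AS-D", links))   # True: a path exists via AS-A and AS-B

# "Withdraw" the routes between AS-A and AS-B, as a coordinated shutdown might.
links["AS-A"].discard("AS-B")
links["AS-B"].discard("AS-A")

print(reachable("AS-C", "AS-D", links))   # False: the two halves can no longer talk
```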

There’s also the “domain name system” (DNS), which translates human-readable domain names like Vox.com into machine-readable IP addresses and relies on 13 root server addresses, each served by many machines distributed around the world. If these servers were compromised, it could cut off access to websites for users around the world, and potentially for our rogue AI as well. Again, though, it would be difficult to take down all of the servers fast enough to prevent the AI from taking countermeasures.
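To make that translation step concrete, here is a minimal Python sketch using only the standard library. It simply asks whatever DNS resolver the operating system is configured to use, so the exact addresses returned are illustrative rather than fixed, and the lookup fails outright if no resolver can be reached, which is the failure mode the paper contemplates.

```python
# Minimal illustration of DNS resolution: turning a human-readable domain
# name into the machine-readable IP addresses computers actually connect to.
import socket

def resolve(domain: str) -> list[str]:
    # getaddrinfo queries the operating system's configured DNS resolver,
    # which ultimately depends on the chain of servers rooted at the
    # 13 root server addresses mentioned above.
    results = socket.getaddrinfo(domain, None)
    ips = []
    for *_, sockaddr in results:
        ip = sockaddr[0]          # first element of the address tuple is the IP
        if ip not in ips:         # deduplicate while preserving order
            ips.append(ip)
    return ips

if __name__ == "__main__":
    # Prints whatever addresses the local resolver returns for vox.com;
    # raises socket.gaierror if DNS is unavailable.
    print(resolve("vox.com"))
```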

The paper also considers the possibility of destroying the internet’s physical infrastructure, such as the undersea cables through which 97 percent of the world’s internet traffic travels. This has recently become a concern in the human-on-human national security world. Suspected cable sabotage has disrupted internet service on islands surrounding Taiwan and on islands in the Arctic.

But globally, there are simply too many cables and too many redundancies built in for a shutdown to be feasible. This is a good thing if you’re worried about World War III knocking out the global internet, but a bad thing if you’re dealing with an AI that threatens humanity.

Option 3: Death from above

In a 1962 test known as Starfish Prime, the US detonated a 1.45-megaton hydrogen bomb 250 miles above the Pacific Ocean. The explosion caused an electromagnetic pulse (EMP) so powerful that it knocked out streetlights and telephone service in Hawaii, roughly 900 miles away. An EMP causes a surge of voltage powerful enough to fry a wide range of electronic devices. The potential effects in today’s far more electronics-dependent world would be much more dramatic than they were in the 1960s.

Some politicians, like former House Speaker Newt Gingrich, have spent years warning about the potential damage an EMP attack could cause. The topic was back in the news last year, thanks to US intelligence reports that Russia was developing a nuclear device to launch into space.

Vermeer’s paper imagines the US intentionally detonating warheads in space to cripple ground-based telecommunications, power, and computing infrastructure. It might take an estimated 50 to 100 detonations in total to cover the landmass of the US with a strong enough pulse to do the job.

This is the ultimate blunt tool where you’d want to be sure that the cure isn’t worse than the disease. The effects of an EMP on modern electronics — which might include surge-protection measures in their design or could be protected by buildings — aren’t well understood. And in the event that the AI survived, it would not be ideal for humans to have crippled their own power and communications systems. There’s also the alarming prospect that if other countries’ systems are affected, they might retaliate against what would, in effect, be a nuclear attack, no matter how altruistic its motivations.

Given how unappealing each of these courses of action is, Vermeer is concerned by the lack of planning he sees from governments around the world for these scenarios. He notes, however, that it’s only recently that AI models have become intelligent enough that policymakers have begun to take their risks seriously. He points to “smaller instances of loss of control of powerful systems that I think should make it clear to some decision makers that this is something that we need to prepare for.”

In an email to Vox, AI researcher Nate Soares, coauthor of the bestselling and nightmare-inducing polemic If Anyone Builds It, Everyone Dies, said he was “heartened to see elements of the national security apparatus beginning to engage with these thorny issues” and broadly agreed with the paper’s conclusions — though he was even more skeptical about the feasibility of using AI as a tool to keep AI in check.

For his part, Vermeer believes an extinction-level AI catastrophe is a low-probability event, but that loss-of-control scenarios are likely enough that we should be prepared for them. The takeaway of the paper, as far as he is concerned, is that “in the extreme circumstance where there’s a globally distributed, malevolent AI, we are not prepared. We have only bad options left to us.”

Of course, we also have to consider the old military maxim that in any question of strategy, the enemy gets a vote. These scenarios all assume that humans would retain basic operational control of government and military command-and-control systems. As I recently reported for Vox, there are reasons to be concerned about AI’s introduction into our nuclear systems, but the AI actually launching a nuke is, for now at least, probably not one of them.

Still, we may not be the only ones planning ahead. If we know how bad the available options would be for us in this scenario, the AI will probably know that too.

This story was produced in partnership with Outrider Foundation and Journalism Funding Partners.


