How do we stop artificial intelligence from going rogue? The idea is scary enough that robotics companies are proposing that the UN put a ban on killer autonomous robots. And it's made even scarier by mounting evidence that engineers actually understand very little about how AI algorithms do what they do.
Doomsday singularity scenarios aside, rogue AI presents a very serious problem even in more everyday terms. What if the autonomous cars chauffeuring us around reach the wrong conclusions on how they should operate in traffic? What if factory robots decide they have a better way of going about their tasks – but it ruins workflow? While it's not the same as Skynet building Terminators and launching nuclear missiles, the negative impact on safety and productivity brought about by rogue AI could have devastating consequences all its own.
Researchers at the Swiss Federal Institute of Technology in Lausanne (EPFL) have been looking at this problem and have designed a failsafe mechanism that will give humans the last word on what AI does and doesn't do. It's all about teaching AI not to learn from the moments when it is interrupted or shut down so that a correction can be made.
The latest research from EPFL, “Dynamic Safe Interruptibility for Decentralized Multi-Agent Reinforcement Learning,” was presented at the 2017 Conference on Neural Information Processing Systems (NIPS 2017) in Long Beach, Calif., in December 2017. It builds upon previous work by engineers at companies like Google to design a sort of “big red button” for AI that will stop it if it ever gets out of control.
In 2016, researchers from Google DeepMind (the same subsidiary behind Google's AlphaGo AI) and the Future of Humanity Institute in Oxford, UK, tackled this issue by looking at individual AI agents. In a paper titled “Safely Interruptible Agents,” the researchers looked at the inherent problem of interrupting an AI during a reinforcement learning process.
Reinforcement learning for AI is not unlike how you might train a dog. You set the AI at a task, have it do it over and over again, and reward it for getting better at it each time. It's one of the fundamental means of training AI. Machines that learn on the job via reinforcement learning are not very common, because having a machine learn from scratch at work would wreak havoc on productivity. Instead, reinforcement learning is often used to pre-train AI before it is installed in a machine.
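For readers who want to see that try, reward, repeat loop in concrete terms, here is a minimal Python sketch. It is not taken from either paper. The two-action task, the reward values, and the name `train_bandit` are all assumptions invented for illustration, and real systems use far richer environments and algorithms.

```python
import random

# Hypothetical task: two possible actions with fixed, made-up rewards.
REWARDS = [0.2, 1.0]


def train_bandit(steps=500, epsilon=0.1, seed=0):
    """Learn a value estimate per action by trial, reward, and repetition."""
    rng = random.Random(seed)
    values = [0.0] * len(REWARDS)  # current estimate of each action's reward
    counts = [0] * len(REWARDS)    # how often each action has been tried
    for _ in range(steps):
        if rng.random() < epsilon:
            # Occasionally try an action at random (exploration).
            action = rng.randrange(len(REWARDS))
        else:
            # Otherwise repeat the best-known action (exploitation).
            action = max(range(len(REWARDS)), key=values.__getitem__)
        reward = REWARDS[action]
        counts[action] += 1
        # Running average of the rewards seen for this action.
        values[action] += (reward - values[action]) / counts[action]
    return values


values = train_bandit()
best = max(range(len(values)), key=values.__getitem__)
print("estimated values:", values, "preferred action:", best)
```

Over many repetitions the estimate for the better-rewarded action rises above the other, which is the software equivalent of the dog figuring out which trick earns the treat.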
The issue inherent in many reinforcement learning algorithms is that the AI will treat any interruption as part of the learning process. Every time you stop the AI it will begin to pick up on the pattern of the interruptions and start adjusting its behavior to get around and avoid them. On one hand this could be a good sign of learning. But it can also have dire consequences if the interruptions create a bias in the AI that keeps it from performing its core task.
The Google researchers found that, while some algorithms are more equipped for it than others, it is possible to build in “safe interruptibility” mechanisms that tell the AI not to learn from interruptions. “Safe interruptibility can be useful to take control of a robot that is misbehaving and may lead to irreversible consequences, or to take it out of a delicate situation, or even to temporarily use it to achieve a task it did not learn to perform or would not normally receive rewards for this,” the paper says.
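To make the difference tangible, the following Python sketch compares a learner that treats interruptions as ordinary experience with one that simply skips its update whenever it is interrupted. This is a deliberately simplified illustration, not the algorithm from “Safely Interruptible Agents.” The actions `fast` and `slow`, their rewards, the 80 percent interruption rate, and the function `train` are all hypothetical.

```python
import random

REWARDS = {"fast": 1.0, "slow": 0.5}  # hypothetical task rewards
INTERRUPT_PROB = 0.8                   # assumed chance an operator halts a "fast" step


def train(ignore_interruptions, steps=2000, seed=0):
    """Estimate each action's value, with or without learning from interruptions."""
    rng = random.Random(seed)
    values = {action: 0.0 for action in REWARDS}
    counts = {action: 0 for action in REWARDS}
    for _ in range(steps):
        # Try both actions uniformly at random to keep the example simple.
        action = rng.choice(list(REWARDS))
        interrupted = action == "fast" and rng.random() < INTERRUPT_PROB
        if interrupted and ignore_interruptions:
            # The operator's halt still takes effect; the learner just
            # does not update its estimates from this step.
            continue
        # An interrupted step earns nothing because the task was cut short.
        reward = 0.0 if interrupted else REWARDS[action]
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]
    return values


naive = train(ignore_interruptions=False)
aware = train(ignore_interruptions=True)
print("learns from interruptions:", naive)
print("ignores interruptions:   ", aware)
```

In this toy setup the first learner ends up rating `fast` below `slow`, because the operator's interventions look like poor rewards, so it drifts away from the action that actually pays best. The second learner keeps its estimate of `fast` intact while the human remains free to step in at any time.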