Two new projects from Google are leading the way toward AI that can learn on its own and even create its own algorithms.
Little more than a year after Google's AI AlphaGo beat world champions at Go, it is no longer the smartest AI in the classroom. Google has developed a new AI, AlphaGo Zero, that is better at Go than its predecessor (beating it 100-0 in head-to-head matches). But what's truly impressive is that it did it all by teaching itself.
In a paper published in the October 2017 issue of the journal Nature, researchers from Google DeepMind discussed how they created an AI capable of learning without human supervision. “Much progress towards artificial intelligence has been made using supervised learning systems that are trained to replicate the decisions of human experts,” the study reads. “However, expert data sets are often expensive, unreliable, or simply unavailable. Even when reliable data sets are available, they may impose a ceiling on the performance of systems trained in this manner.”
The Student Becomes its Own Teacher
The previous version of AlphaGo worked using a combination of neural networks trained on the games of human experts in conjunction with reinforcement learning algorithms wherein the AI got better at the game by playing itself. Using both of these methods, AlphaGo was able to use tree search – combining a “policy” neural network to select the next move and a “value” neural network to predict the outcome of each move – to quickly evaluate each possible move in a game scenario for the best possible outcome.
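The division of labor between the two networks can be sketched in miniature. The snippet below is an invented illustration, not DeepMind's code: a “policy” shortlists promising moves and a “value” scores the resulting positions, standing in (very roughly) for the full Monte Carlo tree search. The toy game, function names, and scoring are all hypothetical.

```python
def select_move(state, moves, policy, value, step, top_k=2):
    """Shortlist moves by policy prior, then pick the move whose
    resulting position the value function rates highest."""
    candidates = sorted(moves, key=lambda m: policy(state, m), reverse=True)[:top_k]
    return max(candidates, key=lambda m: value(step(state, m)))

# Toy stand-in "game": states are integers, a move adds its amount,
# and positions closer to 10 are better.
policy = lambda s, m: m            # prior: naively prefer bigger jumps
value = lambda s: -abs(10 - s)     # value: negative distance to the target
step = lambda s, m: s + m

# From state 7 the policy shortlists moves 5 and 3; the value net
# prefers landing exactly on 10, so move 3 wins out.
print(select_move(7, [1, 2, 3, 5], policy, value, step))  # 3
```

In the real system the policy network keeps the search tree narrow while the value network cuts it short, which is what made exhaustive Go search tractable.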
AlphaGo Zero takes the humans out of the equation, introducing an algorithm that is completely self-taught using only reinforcement learning. The researchers told the algorithm what the rules of the game were and let it learn by playing itself, without any human supervision. As it plays itself, the AlphaGo Zero neural network gets a bit better each time. Each progressively smarter version is then sent back to play itself, so the system is constantly learning from the best version of itself at its current level of play.
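The self-play loop can be illustrated with a much smaller game. The sketch below is hypothetical code, assuming a toy game of Nim (take one or two stones from a pile; whoever takes the last stone wins) rather than Go: it plays games against itself and nudges each move's value toward the eventual winner's outcome – the same learn-from-your-own-games idea in miniature.

```python
import random

def self_play_episode(Q, pile=5, eps=0.2, alpha=0.1):
    """Play one game of toy Nim against yourself, then update
    the value table Q[(pile, move)] from the game's result."""
    history, mover = [], 0
    while pile > 0:
        actions = [a for a in (1, 2) if a <= pile]
        if random.random() < eps:
            a = random.choice(actions)  # explore a new move
        else:
            a = max(actions, key=lambda x: Q.get((pile, x), 0.0))  # exploit
        history.append((pile, a, mover))
        pile -= a
        mover = 1 - mover
    winner = 1 - mover  # whoever moved last took the final stone
    for state, action, player in history:
        target = 1.0 if player == winner else -1.0
        old = Q.get((state, action), 0.0)
        Q[(state, action)] = old + alpha * (target - old)

random.seed(0)
Q = {}
for _ in range(20000):
    self_play_episode(Q)

def best_move(pile):
    return max((a for a in (1, 2) if a <= pile), key=lambda a: Q[(pile, a)])
```

Against its own ever-improving play, the move values converge on this game's known optimal strategy – always leave your opponent a multiple of three stones – with no human examples ever provided.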
AlphaGo Zero differs from its predecessors in other key ways, as well. Where previous versions used a small number of hand-engineered features as input, AlphaGo Zero relies only on what it has available – the black and white stones of the Go board. It also combines the value and policy neural networks used by previous AlphaGo versions into a single network. And it does not use “rollouts,” or random games generated by other Go programs, to help it predict winning moves.
During its learning process, researchers found that the AI was not only teaching itself the basics of the game, it was also discovering “non-standard strategies beyond the scope of traditional Go knowledge.” Go players, like chess players, are ranked according to what is called the Elo scale, a mathematical system meant to calculate the relative skill of players. The difference in players' Elo ratings is meant to serve as a predictor of the outcome of a given match. The greater the rating difference, the more often the higher-rated player is predicted to win.
The highest-ranking professional Go players (9th dan) have an Elo rating of 2940 on average. According to the Nature study, AlphaGo Zero was able to exceed this level of proficiency in less than five days. By the end of its 40-day training period the researchers calculated that AlphaGo Zero's Elo rating had exceeded 5,000.
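The prediction behind those ratings follows the standard Elo formula: a player's expected score depends only on the rating difference, with each 400-point gap multiplying the odds in the stronger player's favor by ten. A short Python version of that standard formula (this is textbook Elo arithmetic, not code from the study):

```python
def expected_score(rating_a, rating_b):
    """Expected score (roughly, win probability) for player A
    against player B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

# Evenly matched players split the points...
print(expected_score(1600, 1600))  # 0.5
# ...while a 5,000-rated AlphaGo Zero is a near-certain favorite
# over a 2,940-rated 9th-dan professional.
print(expected_score(5000, 2940))
```

At a 2,060-point gap, the model gives the human professional only a few chances in a million per game.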
An AI that can train itself without the need for human knowledge has wide-ranging applications, particularly in spaces where human expertise may be limited or altogether unavailable. Taken into the factory setting, this could go beyond predictive maintenance and lead to machines that can not only diagnose and predict a problem but also repair themselves. Or imagine it applied in medicine, as IBM's Watson has been. What sort of diagnoses and novel treatments might an AI come up with if allowed to study disease and treatments without human knowledge?
Machine Learning That Writes Itself
Meanwhile, another research project at Google, AutoML, is showing that it is possible for AI not only to teach itself but to program itself. AutoML (short for “automated machine learning”) allows “controller” algorithms to create and train “child” models for specific tasks, which can then be evaluated for their effectiveness. That feedback is then given back to the controller algorithm, which uses it to improve the child the next time around.
After thousands of repetitions of this (reinforcement learning again), the controller gets better at creating algorithms. Essentially, it throws a bunch of algorithms at a problem and figures out, based on feedback, which one (or combination) best handles it. When Google engineers tested AutoML, they found that it actually created architectures similar to those that human engineers would design.
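That controller-and-child feedback loop can be sketched as a simple search. The code below is a hypothetical illustration – the search space, the scoring landscape, and the epsilon-greedy controller are all invented stand-ins, far simpler than Google's actual reinforcement-learned controller – but the loop is the same: propose a child configuration, receive a score as feedback, and use it to propose better children.

```python
import random

# Invented toy search space: each "architecture" is a (layers, width) pair.
SPACE = [(layers, width) for layers in (1, 2, 3) for width in (8, 16, 32)]

def evaluate_child(arch):
    """Stand-in for training a child model and measuring validation
    accuracy; here a fixed landscape where (2, 16) is best, plus noise."""
    layers, width = arch
    return 1.0 - 0.1 * abs(layers - 2) - abs(width - 16) / 160 + random.gauss(0, 0.01)

def controller_search(rounds=2000, eps=0.2):
    """Epsilon-greedy controller: propose a child, collect its score as
    feedback, update a running mean, and favor the best-scoring design."""
    mean = {a: 0.0 for a in SPACE}
    count = {a: 0 for a in SPACE}
    for _ in range(rounds):
        if random.random() < eps:
            arch = random.choice(SPACE)      # explore a new child design
        else:
            arch = max(SPACE, key=mean.get)  # exploit the current best
        score = evaluate_child(arch)         # feedback from the child
        count[arch] += 1
        mean[arch] += (score - mean[arch]) / count[arch]
    return max(SPACE, key=mean.get)

random.seed(1)
best = controller_search()
print("controller's pick:", best)
```

After enough rounds, the controller settles on the configuration the feedback signal rewards most, just as AutoML's controller converges on effective child architectures.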
This approach still requires human intervention to provide feedback, but if it could be trained like AlphaGo Zero to recognize the parameters of a successful algorithm for a given task, it could someday be applied to creating some very complex AI.
In a blog post, Google CEO Sundar Pichai wrote that with AutoML Google is hoping to address the shortage of engineers who are experts in AI while also continuing to expand the use of the technology. By using AutoML, engineers may some day be able to develop machine learning algorithms without needing such comprehensive knowledge of neural networks and deep learning. “Today, designing neural nets is extremely time intensive, and requires an expertise that limits its use to a smaller community of scientists and engineers,” Pichai wrote. “... We hope AutoML will take an ability that a few PhDs have today and will make it possible in three to five years for hundreds of thousands of developers to design new neural nets for their particular needs.”
Right now AutoML can only be applied to some basic tasks like image categorization, so we shouldn't expect to see our factory robots or autonomous cars creating their own algorithms any time soon. But Google has demonstrated that such a thing could be feasible in the future. Having AI teach and program itself could also offer new insights into how AI works and may point engineers toward new methods of creating AI.
As the AlphaGo Zero team wrote in the Nature study, “Humankind has accumulated Go knowledge from millions of games played over thousands of years, collectively distilled into patterns, proverbs, and books. In the space of a few days, starting tabula rasa, AlphaGo Zero was able to rediscover much of this Go knowledge, as well as novel strategies that provide new insights into the oldest of games.”
In a keynote presentation at ESC Silicon Valley, taking place Dec. 5-7, 2017, Gunnar Newquist, Founder & CEO of Brain2Bot Inc., explores how the next revolution in AI will come from an understanding of natural intelligence.
Chris Wiltz is a Senior Editor at Design News, covering emerging technologies including AI, VR/AR, and robotics.