Google has developed a machine learning system capable of mastering Go – an ancient Chinese game whose complexity stumped computers for decades.

IBM’s Deep Blue computer mastered chess in the mid-1990s, and in more recent years a system built by Google’s DeepMind lab has beaten humans at classic 70s arcade games. Go, however, was a different matter.

Go offers roughly 200 possible moves per turn, compared with about 20 per turn in chess. Over the course of a game of Go there are so many possible sequences of moves that searching through each of them to identify the best play is computationally too costly.
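To give a rough sense of that scale, the sketch below compares how many move sequences those per-turn figures imply. The game lengths used (80 turns for chess, 150 for Go) are illustrative assumptions, not figures from DeepMind, and the helper name `log10_sequences` is hypothetical.

```python
import math

def log10_sequences(moves_per_turn: int, turns: int) -> float:
    """Return log10 of moves_per_turn ** turns, the number of possible move sequences."""
    return turns * math.log10(moves_per_turn)

# Per-turn figures are the ones quoted in the article; game lengths are assumed.
chess_exponent = log10_sequences(moves_per_turn=20, turns=80)
go_exponent = log10_sequences(moves_per_turn=200, turns=150)

print(f"chess: about 10^{chess_exponent:.0f} move sequences")
print(f"go:    about 10^{go_exponent:.0f} move sequences")
```

Under these assumptions chess comes out at around 10^104 sequences and Go at around 10^345, which is why checking every line of play by brute force is not an option.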

Now a system developed by Google DeepMind has beaten European Go champion and elite player Fan Hui. Rather than being explicitly programmed to play the game, the AlphaGo system learned how to do so using two deep neural networks and an advanced tree search.

Go is typically played on a 19-by-19 grid and sees players attempt to surround empty areas of the board and capture an opponent’s pieces. To teach the system how to play the game, 30 million moves from Go games played by human experts were fed into AlphaGo’s neural networks. The system then used reinforcement learning to work out the type of moves that were most likely to succeed, building on these past matches. This approach allows AlphaGo to restrict the number of possible moves it needs to search through during a game, making the process more manageable.
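The idea of narrowing the search can be sketched in a few lines. This is a toy illustration rather than AlphaGo’s actual code: the move names, the probabilities and the `top_k_moves` helper are all hypothetical, standing in for the output of a trained network, and a simple top-k cut-off is swapped in for whatever the real tree search does with those scores.

```python
def top_k_moves(move_probabilities: dict[str, float], k: int) -> list[str]:
    """Keep only the k moves that the (hypothetical) network rates as most promising."""
    ranked = sorted(move_probabilities, key=move_probabilities.get, reverse=True)
    return ranked[:k]

# Made-up probabilities standing in for a trained network's output.
policy_output = {"D4": 0.41, "Q16": 0.33, "C3": 0.15, "K10": 0.08, "A1": 0.03}

candidates = top_k_moves(policy_output, k=2)
print(candidates)  # ['D4', 'Q16']
```

Instead of exploring all five options, a search built on this would only look further down the two highest-rated moves.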

DeepMind CEO Demis Hassabis described Go as “probably the most complex game that humans play. There’s more configurations of the board than there are atoms in the universe.”

It was that complexity that made the game so difficult for machines to master, said DeepMind’s David Silver. “In the game of Go we need this amazingly complex intuitive machinery, which people previously thought was only available in the human brain, to even have the right idea of who’s ahead and what the right move is.”

Google has suggested that the approach used by AlphaGo to learn how to master Go could be extended to solving more weighty problems, such as climate change modelling, as well as to improving Google’s interactions with users of its services.

For instance, DeepMind’s Silver suggests the technology could help personalize healthcare by using a similar reinforcement learning technique to understand which treatments would “lead to the best outcomes for individual patients based on their particular track record and history”.

More significantly, Hassabis sees the achievement as progress towards an even grander goal: building an AI with the same general capabilities and understanding as humans.

“Most games are fun and were designed because they’re microcosms of some aspect of life. They might be slightly constrained or simplified in some way but that makes them the perfect stepping stone towards building general artificial intelligence.”

Similar AI initiatives are underway at tech giants across the world, with Facebook recently revealing its deep learning system’s ability to recognise people and things in images and to predict real-world outcomes, such as when a tower of blocks will topple.

Why Google is pursuing narrow, not general, AI

Dr Simon Stringer, director of the Oxford Centre for Theoretical Neuroscience and Artificial Intelligence, said that AlphaGo and other deep learning systems are good at specific tasks – be that spotting objects or animals in photos or mastering a game. But these systems work very differently from the human brain and shouldn’t be viewed as representing progress towards developing a general, human-like intelligence – which he believes requires an approach guided by biology.

“If you want to solve consciousness you’re not going to solve it using the sorts of algorithms they’re using,” he said.

“We all want to get to the moon. They’ve managed to get somewhere up this stepladder, ahead of us, but we’re only going to get there by building a rocket in the long term.

“They will certainly develop useful algorithms with various applications but there will be a whole range of applications that we’re really interested in that they will not succeed at by going down that route.”

In the case of DeepMind, Stringer says the reinforcement learning approach used to teach systems to play classic arcade games and Go has limitations compared to how animals and humans acquire knowledge about the world.

Although these reinforcement learning algorithms can learn to map which actions lead to the best outcomes, they are “model-free”, meaning the system “knows nothing about its world”.

That approach is very different to how a rat’s brain enables it to navigate a maze, he said.

“It’s been shown over a half a century ago that what rats do is learn about the structure of their environment, they learn about the spatial structure and the causal relations in their world and then, when they want to get from A to B, they juggle that information to create a novel sequence of steps to get to that reward.”

Teaching a system using model-free reinforcement learning is “behaviorally very limiting”, Stringer says.

“As the environment changes, for example one route is blocked off, the system doesn’t know anything about its world so it can’t say ‘This path is blocked, I’m going to take the next shortest one’. It can’t adapt but rats can.”
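The contrast Stringer draws can be illustrated with a toy maze. The sketch below is not DeepMind’s or Stringer’s code: the five-node maze, tabular Q-learning as the model-free learner and breadth-first search as the model-based planner are all assumptions chosen to keep the example small.

```python
import random
from collections import deque

GOAL = "G"
# Toy maze as a directed graph: a short route A-B-G and a longer one A-C-D-G.
MAZE = {"A": ["B", "C"], "B": ["G"], "C": ["D"], "D": ["G"], "G": []}

def q_learning(maze, episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Model-free: learn action values purely from trial and error."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s, actions in maze.items() for a in actions}
    for _ in range(episodes):
        state = "A"
        while state != GOAL:
            action = rng.choice(maze[state])          # explore at random
            reward = 1.0 if action == GOAL else 0.0
            best_next = max((q[(action, a)] for a in maze[action]), default=0.0)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = action                            # the action names the next node
    return q

def greedy_route(q, maze, start="A"):
    """Follow the learned values; give up if the preferred step no longer exists."""
    route, state = [start], start
    while state != GOAL:
        learned = [(value, a) for (s, a), value in q.items() if s == state]
        _, action = max(learned)
        if action not in maze[state]:
            return None                               # stuck: the table holds no alternative
        route.append(action)
        state = action
    return route

def planned_route(maze, start="A"):
    """Model-based: breadth-first search over a known map of the maze."""
    queue, seen = deque([[start]]), {start}
    while queue:
        route = queue.popleft()
        if route[-1] == GOAL:
            return route
        for nxt in maze[route[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(route + [nxt])
    return None

q_values = q_learning(MAZE)
blocked = dict(MAZE, B=[])                            # the short route is now blocked at B

print(greedy_route(q_values, MAZE))     # route learned before the change
print(greedy_route(q_values, blocked))  # same value table after the change
print(planned_route(blocked))           # planner given the updated map
```

Before the change the learned values lead along A, B, G. Once the exit from B is blocked, the same table walks into the dead end and returns `None`, while the planner, which holds a map of the maze, finds A, C, D, G. In this sketch the model-free learner could recover only by going through further trial and error in the changed maze.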

Similarly, Google’s announcement a few years back that it had trained a neural network to spot cats in images doesn’t represent a step towards developing a human-like vision system.

“When we look at a cat, we’re not just aware there’s a cat in the image, we see all of the millions of visual features that make up that cat and how they’re related to each other. In other words our visual experience is much richer than one of these deep learning architectures, which simply tells you whether there’s a particular kind of feature in an image.”

In particular, he said, such systems lack the human ability to bind features together and so to understand comprehensively how features in an image are related to one another. Deep learning neural networks also generally don’t model biological systems that appear to play a key role in how humans assign meaning to the world. These models typically exclude, for example, feedback in the brain’s visual cortex and the precise timings of the electrical pulses between neurons, he said, adding that the centre in Oxford had developed concrete theories about the importance of these features in the visual cortex.

“We brought all of those elements together. At the very least it gives us a deep insight into what is so special about human vision that hasn’t been captured in artificial vision systems yet.”

This biologically inspired approach is very different to that taken by DeepMind, but Stringer believes it is necessary to have a chance of one day cracking general artificial intelligence.

The downside, Stringer believes, is that the ultimate payoff for his research will be a long time coming, a factor he thinks has driven DeepMind’s decision to focus on narrow AI that could be applicable in the near future.

“I have to admit, I’m always a bit surprised, given the resources that DeepMind have, why they don’t devote more resources to actually trying to recreate the dynamics of brain function and I think it’s because when you’re trying to raise funding you need to produce jam today, you need these algorithms to work quickly otherwise that tap gets turned off.

“My aim is to produce the first prototypical conscious systems, something very simple, somewhere between a mouse and a rat, within the next 20 to 30 years.”

The DeepMind software that beat Go champion Hui, in a match that took place last October, was running on Google Cloud Platform and reportedly distributed across about 170 GPUs (graphics processing units) and 1,200 CPUs (central processing units).

The next major challenge for Google’s AlphaGo will come in March, when it will play the world’s reigning Go champion Lee Sedol.

DeepMind’s Silver is confident AlphaGo has what it takes to beat all comers, at least in the long run.

“A human can perhaps play 1,000 games a year, AlphaGo can play through millions of games every single day. It’s at least conceivable that as a result AlphaGo could, given enough processing, given enough training, given enough search power, reach a level that’s beyond any human.”
