Deep thinking: Making machines better learners

The journey towards creating artificial intelligence has been slower and more difficult than early pioneers predicted. Modern computers are truly impressive in many ways.

But they don’t hold a candle to the human brain when it comes to picking a face out of a crowd, understanding a pun, composing a symphony, or any of hundreds of things humans do.

Now a new approach pioneered by CIFAR fellows is shaking up the world of artificial intelligence. Over the past decade the fellows have championed a technique called deep learning and made it one of the hottest areas in artificial intelligence.

If you use the voice recognition feature on an Android phone you already benefit from deep learning networks, which improved voice recognition by 25 per cent over the best existing techniques. A similar improvement in image recognition led Google to implement deep learning techniques on their Google+ service last year.

In fact, the Internet giants are happily snatching up CIFAR fellows to work on their artificial intelligence efforts. Last year Google recruited Geoffrey Hinton, until recently director of CIFAR’s Learning in Machines & Brains program (formerly known as Neural Computation & Adaptive Perception), to work in its artificial intelligence laboratory. Shortly after, Facebook followed suit, hiring Senior Fellow Yann LeCun to set up a new artificial intelligence laboratory for them. And since 2011 Senior Fellow Andrew Ng has been at Google, where he developed the Google Brain neural network.

Geoff Hinton

“NCAP was crucial,” says Hinton. “The fundamental idea of CIFAR, which is to get the best people and put them in contact where they can exchange ideas, worked really well.”

The digital giants are interested in deep learning because the technique promises to allow their computers to sift through millions of photos and videos and describe them as accurately as any human could; or to understand natural language beyond the level of simple keyword searching; or, perhaps, to make better predictions about which ads we’re likely to click on.

“The interest of large companies in artificial intelligence is really focused today on deep learning,” says LeCun. “And deep learning was basically a CIFAR-funded conspiracy.”

From Pong to neurons

From the most primitive game of Pong to the most sophisticated supercomputer climate model, conventional computer programs consist of precisely written, step-by-step instructions that have to be carried out exactly as written. Although these programs can be fantastically complex, they consist of steps written down by a human programmer who had to figure out just what the program should do and how it should do it.

But as early as the 1950s, some computer scientists became interested in another direction. They began to experiment with artificial neural networks, loosely modelled on the workings of the human brain. Rather than being programmed, these networks were trained, learning from experience to arrive at the right answer.

It seemed like a good idea. Consider all of the ways a picture of a cat can look. It can be in different colours, taken from different angles, by itself or in the frame with other objects or animals, etc. The neural network between our ears does a great job of picking out cats. But how do you write an algorithm describing a step-by-step process of recognizing a cat?

The promise of neural networks was that you wouldn’t have to. You could simply show the neural network a lot of pictures of cats and let it learn what they looked like.

Yann LeCun

There was a lot of interest, but there were also a lot of problems. For one thing, the networks weren’t always easy to train. You had to collect a lot of data and label it – picture a team of grad students getting together hundreds or thousands of photos and making sure the ones labeled “cat” really had a cat, and the ones labeled “not cat” really didn’t. Then after you’d trained the network, you had to use even more examples to make sure the network worked on photos it hadn’t been trained on.

Neural nets could be tough to train for other reasons. When they failed to work it could be hard to figure out why. Or they would seem to work, only to reveal later that they had been “overtrained” – they might have only learned to recognize some other common feature, like a random pattern of pixels that the particular collection of cat photos accidentally had in common.
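To make the held-out check and the "overtrained" failure concrete, here is a minimal sketch in Python. It is an illustration only, not anything the researchers used: the `Memorizer` class, the `split_examples` and `accuracy` helpers, and the random toy "photos" are all hypothetical. The learner simply memorizes its training examples, so it scores perfectly on data it has seen and does no better than guessing on photos it has not.

```python
import random

def split_examples(examples, test_fraction=0.25, seed=0):
    """Shuffle labeled examples and hold some back for testing."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)

class Memorizer:
    """Hypothetical 'overtrained' learner: it stores every training example verbatim."""
    def fit(self, examples):
        self.table = {features: label for features, label in examples}
        return self

    def predict(self, features):
        # Anything it has not seen before falls back to a fixed guess.
        return self.table.get(features, "not cat")

def accuracy(model, examples):
    correct = sum(model.predict(features) == label for features, label in examples)
    return correct / len(examples)

# Toy data: random "pixel" features with random labels, so there is nothing real to learn.
rng = random.Random(1)
examples = [
    (tuple(rng.random() for _ in range(4)), rng.choice(["cat", "not cat"]))
    for _ in range(200)
]

train, test = split_examples(examples)
model = Memorizer().fit(train)
print("accuracy on training photos:", accuracy(model, train))
print("accuracy on held-out photos:", accuracy(model, test))
```

The gap between the two numbers is the warning sign: a network that looks flawless on its training set may have learned nothing that carries over to new photos.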

After a surge of research in the 1980s, interest in neural nets had largely fallen away by the 1990s, to be replaced by other forms of machine learning. In fact, one respected journal was said to have stopped considering any paper with the term “neural network” in the title.

Obviously wrong

But Hinton didn’t give up on neural nets. He says it made sense to look towards the workings of the human brain to figure out better ways of achieving machine learning. After all, if we want to teach computers to perceive things the way we do, why not use a model that has been shaped by evolution to do just that? He wasn’t put off by the consensus of the field – he grew up as an atheist in a Christian school, and was used to trusting his own beliefs.

“When people said it’s irrelevant how the brain works, they were just utterly and obviously wrong,” Hinton says.

In 2004 Hinton and a number of other researchers including Senior Fellow Yoshua Bengio (Université de Montréal) formed CIFAR’s NCAP program and immediately began discussing how to make neural networks work better.

“It was a matter of time, but we had to convince the community that it was worth the effort to work on this,” LeCun says.

The community finally began to sit up and take notice in 2006 when Hinton and colleagues published a paper called “A fast learning algorithm for deep belief nets” in the journal Neural Computation. The paper described a new way to design better “deep” neural networks – that is, neural networks with three or more “hidden” layers between the input layer and the output layer.

The new technique trained one layer of the network at a time. Neurons in the first layer would learn to represent some feature of the data, for instance to distinguish a horizontal line. When the first layer had learned something, the data would be passed on to the next layer, which would learn to represent some other feature of the data – perhaps combining two or more shapes to learn to recognize an eyebrow. The next layer might contain a neuron that recognized the combination of an eyebrow and an eye. Essentially, each higher layer of the network would learn to operate at a higher and higher level of abstraction.
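The one-layer-at-a-time idea can be sketched in a few lines of Python with NumPy. This is a simplified illustration in the spirit of the approach, not the paper's actual algorithm or code: it stacks small restricted Boltzmann machines, each trained with one step of contrastive divergence, and every name (`train_rbm`, `greedy_pretrain`), the layer sizes, the learning rate and the random toy data are assumptions chosen for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=50, lr=0.1, rng=None):
    """Train a single layer (a restricted Boltzmann machine) with CD-1, without labels."""
    rng = np.random.default_rng(0) if rng is None else rng
    n_visible = data.shape[1]
    W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))
    b_vis = np.zeros(n_visible)
    b_hid = np.zeros(n_hidden)
    for _ in range(epochs):
        # Positive phase: hidden activity driven by the real data.
        h_prob = sigmoid(data @ W + b_hid)
        h_sample = (rng.random(h_prob.shape) < h_prob).astype(float)
        # Negative phase: reconstruct the input, then re-infer the hidden units.
        v_recon = sigmoid(h_sample @ W.T + b_vis)
        h_recon = sigmoid(v_recon @ W + b_hid)
        # Nudge the weights towards the data and away from the reconstruction.
        n = data.shape[0]
        W += lr * (data.T @ h_prob - v_recon.T @ h_recon) / n
        b_vis += lr * (data - v_recon).mean(axis=0)
        b_hid += lr * (h_prob - h_recon).mean(axis=0)
    return W, b_hid

def greedy_pretrain(data, layer_sizes, rng=None):
    """Train the layers one at a time, feeding each layer's output to the next."""
    rng = np.random.default_rng(0) if rng is None else rng
    layers = []
    x = data
    for n_hidden in layer_sizes:
        W, b_hid = train_rbm(x, n_hidden, rng=rng)
        layers.append((W, b_hid))
        x = sigmoid(x @ W + b_hid)  # this layer's features become the next layer's input
    return layers, x

# Toy usage: 20 random binary "images" of 6 pixels each, two stacked layers.
rng = np.random.default_rng(0)
images = (rng.random((20, 6)) < 0.5).astype(float)
layers, top_features = greedy_pretrain(images, [4, 3], rng=rng)
print("shape of top-level features:", top_features.shape)
```

The loop in `greedy_pretrain` is the key design choice: no layer starts learning until the one beneath it has finished, so each layer works with the features discovered below it rather than with raw pixels.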

Even more cats

Possibly even more exciting was that Hinton’s paper showed that these networks didn’t need to be supervised. You could set them loose on an unlabeled collection of images, and they could learn to recognize relevant features for themselves. After the initial learning was done you could come along and fine-tune the process and add labels, telling the network that this image was a car, this one an airplane, etc.

“If you think about how babies learn,” says LeCun, “they learn by themselves the notion of objects, the properties of objects, without being told specifically what those objects are. It’s only later that we give names to the objects. So most of the learning takes place in an unsupervised manner.”

In fact, last year Andrew Ng made news with a Google network that did this on a giant scale. Working with colleague Jeff Dean and the Google “brain team,” he created a deep learning network composed of 16,000 computer processors, and fed it 10 million images extracted at random from the Internet, with no labels. The network taught itself to recognize human faces, human bodies – and yes, cats.

Yoshua Bengio

Hinton’s 2006 paper created a groundswell of interest in neural networks, and researchers began working on them again.

Hinton says that part of the recent success of neural networks comes from the huge advances in both computing power and availability of data. Neural nets that were too complex to be practical on older, slower processors hummed along perfectly on modern workstations. And bigger computer memories and the Internet made it much easier to get big data sets to train the networks on.

“The main issue was that there wasn’t enough data and the computers weren’t fast enough,” Hinton says. “As soon as we got 1,000 times the data and computers a million times as fast, neural networks started beating everything else.”

What lies ahead

The NCAP program was created with the dual interest of figuring out how neurons can be used in computation, and how computers can be made to perceive patterns. In many ways, Hinton says, vision is a perfect problem for machine learning. We actually know a lot about how the brain processes vision, compared, for instance, to how it processes language.

As he continues his research, he sees two major challenges. First, he wants to push forward on unsupervised learning. Second, he has to tackle the problem of how to make neural networks work at larger and larger scales.

First is the unsupervised learning problem. Although Hinton’s 2006 paper gained a lot of attention because of the promise of unsupervised learning, once researchers dusted off the old neural nets and started running them on modern computers with big data sets, they realized that the old techniques worked well enough. Most of the neural net applications in use now were actually trained with labeled data. Nevertheless, supervised learning still has limitations.

“What we really want is something that will be like a person and will just understand the world. And there, unsupervised learning will be crucial,” Hinton says.

“For example, we’d like to be able to understand every video on YouTube. It would be nice if you could say, find me a video of a cat trying to jump on a shelf and falling. A person would understand exactly what you were saying. Right now, maybe machine learning methods could say there’s probably a cat in this video. They might even be able to say there’s a shelf. But the idea of a cat trying to jump on a shelf and failing, they don’t understand that. My prediction is that over the next five years, we’ll be able to understand that.”

The other problem is how to make neural networks “scale” – that is, how to make them work efficiently as they get bigger and bigger. Right now, Hinton says, the computing power you need is roughly the square of the speed increase you want. In other words, twice as much speed requires four times the computing power; 10 times the speed requires 100 times the computing power.
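The arithmetic behind that rule of thumb is easy to check. The short Python sketch below simply squares the desired speed-up; the function name `compute_multiplier` is made up for illustration, and the square law itself is only the rough estimate quoted above, not a measured result.

```python
def compute_multiplier(speedup):
    """Rough computing power needed, relative to today, for a given speed-up.

    Assumes the square-law rule of thumb: power grows as the square of the speed-up.
    """
    return speedup ** 2

for speedup in (2, 10):
    print(f"{speedup}x the speed needs about {compute_multiplier(speedup)}x the computing power")
```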

Hinton will split his time between the University of Toronto and Google, spending four months of the year at the corporation’s headquarters in Mountain View, as well as working in its Toronto office. He says he’s looking forward to the resources Google has – especially the data – as well as the researchers.

“They’ve got really smart people there, and they’ve got very interesting problems there. It’s quite nice when you do something for it to be in a billion Android phones.”

At Facebook, LeCun will have a similar situation. He’ll continue to teach at NYU, and will be able to set up the Facebook artificial intelligence lab practically across the street from his campus office. “The nice thing about Facebook is that if you come up with a better way of understanding natural language, or image recognition, it’s not like you have to create a whole business around it. It’s just a matter of setting it up for 1.3 billion users.”

Although Hinton has stepped down as director of NCAP, LeCun and Bengio have stepped up as co-directors of the program. LeCun says the program will continue to explore improvements in deep learning networks.

“There’s no question in my mind that the problem of learning representations of the world will have to be solved by an AI system we build. Deep learning is the only answer we have to that problem now,” he says.
