Architects of Intelligence
Take Go for example. AI researchers have long believed that playing Go would require some kind of sophisticated pattern recognition, but they didn’t necessarily understand that it could be solved using the same kind of pattern recognition approaches you would use for perception problems in vision and speech. However, now people have shown that you can take neural networks, the same kind that were developed in those more traditional pattern recognition domains, and you can use them as part of a solution to playing Go, as well as chess, or similar board games. I think those are interesting models because they use what we’re calling deep learning here, but they don’t just do that, they also use traditional game tree search and expected value calculations, and so on. AlphaGo is the most striking and best-known success of deep learning AI, and it’s not even a pure deep learning system. It uses deep learning as part of a system for playing a game and searching a game tree.
That already represents the way that deep learning expands beyond deep neural networks, but still, the secret sauce that makes it work so well is a deep neural network and the methods of training it. Those methods are finding patterns in the structure of gameplay that go way beyond the patterns people were able to find in an automatic way before. If you look beyond any one task, like playing Go or playing chess, to the broader problems of intelligence, though, the idea that you’re going to turn all of intelligence into a pattern recognition problem is ridiculous, and I don’t think any serious person can believe that. I mean, maybe some people will say that, but that just seems crazy to me.
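To make that combination concrete, the sketch below is a deliberately tiny, hypothetical version of the AlphaGo-style recipe described above: a classical game-tree search that falls back on a learned evaluation to score positions it cannot afford to search to the end. The game (a Nim variant), the heuristic, and all the names are invented for illustration; in a real system the evaluation would be a trained deep network.

```python
# A minimal, hypothetical sketch: depth-limited game-tree search that falls
# back on a stand-in "value network" to score positions it cannot search to
# the end. The game (a Nim variant), the heuristic, and all names here are
# invented for illustration; nothing below is AlphaGo's actual code.

def legal_moves(pile):
    # Toy game: remove 1-3 stones from a pile; taking the last stone wins.
    return [m for m in (1, 2, 3) if m <= pile]

def value_network(pile):
    # Placeholder for a learned evaluation in [-1, 1]. In AlphaGo this is a
    # deep network trained on gameplay; here it is a hand-written heuristic.
    return 1.0 if pile % 4 != 0 else -1.0

def search(pile, depth):
    """Negamax value of the position for the player about to move."""
    if pile == 0:
        return -1.0                    # the previous player took the last stone
    if depth == 0:
        return value_network(pile)     # fall back on the fast "intuitive" guess
    return max(-search(pile - m, depth - 1) for m in legal_moves(pile))

def best_move(pile, depth=4):
    return max(legal_moves(pile), key=lambda m: -search(pile - m, depth - 1))

if __name__ == "__main__":
    print(best_move(10))  # with 10 stones the winning reply is to take 2
```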
Every serious AI researcher has to think two things simultaneously. One is they have to recognize that deep learning and deep neural networks have contributed a huge amount to what we can do with pattern recognition, and that pattern recognition is going to be a part of probably any intelligent system’s success. At the same time, you also have to recognize that intelligence goes way beyond pattern recognition in all the ways I was talking about. There are all these activities of modeling the world, such as explaining, understanding, imagining, planning, and building out new models, and deep neural networks don’t really address that.
MARTIN FORD: Is that limitation one of the things you’re addressing in your work?
JOSH TENENBAUM: Well, with my work I’ve been interested in finding the other kinds of engineering tools that we need to address the aspects of intelligence that go beyond pattern recognition. One of the approaches is to look to earlier waves of ideas in the field, including the ideas of graphical models and Bayesian networks, which were the big thing when I got into the field. Judea Pearl is probably the most important name associated with that era of the field.
Perhaps most important of all is the earliest wave, often called “symbolic AI.” Many people will tell a story that in the early days of AI we thought intelligence was symbolic, but then we learned that was a terrible idea. It didn’t work because it was too brittle, couldn’t handle noise, and couldn’t learn from experience. So we had to get statistical, and then we had to get neural. I think that’s very much a false narrative. The early ideas that emphasized the power of symbolic reasoning and abstract languages expressed in formal systems were incredibly important and deeply right ideas. I think it’s only now that we’re in the position, as a field and as a community, to try to understand how to bring together the best insights and the power of these different paradigms.
The three waves in the field of AI—the symbolic era, the probabilistic and causal era, and the neural networks era—are three of our best ideas on how to think about intelligence computationally. Each of these ideas has had its rise and fall, with each one contributing something, but neural networks have really had their biggest successes in the last few years. I’ve been interested in how we bring these ideas together. How do we combine the best of these ideas to build frameworks and languages for intelligent systems and for understanding human intelligence?
MARTIN FORD: Do you imagine a hybrid that would bring together neural networks and other more traditional approaches to build something comprehensive?
JOSH TENENBAUM: We don’t just imagine it, we actually have it. Right now, the best examples of these hybrids go by the name of probabilistic programming. When I give talks or write papers, I often point to probabilistic programming as the general tool that I’m using in my work. It’s one that some people know about. It’s not nearly as broadly embraced as a way of thinking about AI as neural networks are, but I think it’s going to be increasingly recognized in its own right.
All these terms, like neural networks or probabilistic programming, are only vague terms that continually redefine themselves as the people working with these toolsets learn more about what works, what doesn’t work, and what other things they need. When I talk about probabilistic programs, I sometimes like to say that they have about as much to do with probability as neural networks have to do with neurons. Namely, neural networks were inspired by early abstractions of how a neuron works, and by the idea that if you wire neurons together into a network, whether it’s biological or artificial, and you make that network complicated enough in certain ways, it becomes very powerful. The core meaning of a neuron stays around: there are basic processing units that take linear combinations of their inputs and pass them through a non-linearity. But if you look at the ways in which people are using neural networks now, they go way beyond any kind of actual neuroscience inspiration. In fact, they bring ideas from probability and from symbolic programs into them. I would say probabilistic programs are just approaching that same kind of synthesis but coming from a different direction.
The idea of probabilistic programs starts from work that people did in the 1990s where they tried to build systematic languages for large-scale probabilistic reasoning. People realized that you needed to have tools that didn’t just do probabilistic reasoning, but also had abstract, symbolic components that were more like earlier eras of AI, in order to capture real common-sense knowledge. It wasn’t enough to work with numbers, you had to work with symbols. Real knowledge is not just about trading off numbers for other numbers, which is what you do in probability theory, it’s about expressing abstract knowledge in symbolic forms, whether it’s math, programming languages, or logic.
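As a concrete illustration of that idea (this is not Church itself, just a minimal sketch in Python): a probabilistic program is ordinary symbolic code that makes random choices, and inference means asking which of those choices are probable given something we observed. The model and numbers below are invented, and inference is done by plain rejection sampling, the simplest possible method.

```python
# A minimal sketch of the idea (not Church itself): a probabilistic program is
# ordinary symbolic code that makes random choices, and inference asks which
# choices are likely given an observation. Model and numbers are invented;
# inference here is plain rejection sampling.
import random

def generative_model():
    """A coin is either fair or biased, and we flip it five times."""
    is_biased = random.random() < 0.1            # prior: 10% chance of bias
    p_heads = 0.9 if is_biased else 0.5
    flips = [random.random() < p_heads for _ in range(5)]
    return is_biased, flips

def probability_biased_given(observed_flips, samples=100_000):
    """Posterior probability that the coin is biased, given the observed flips."""
    accepted = []
    for _ in range(samples):
        is_biased, flips = generative_model()
        if flips == observed_flips:              # keep only runs matching the data
            accepted.append(is_biased)
    return sum(accepted) / len(accepted)

if __name__ == "__main__":
    print(probability_biased_given([True] * 5))  # five heads in a row
```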
MARTIN FORD: So, this is the approach that you’ve been focusing on?
JOSH TENENBAUM: Yes, I was very lucky to work with students and postdocs in my group in the mid to late 2000s, especially Noah Goodman, Vikash Mansinghka, and Dan Roy, where we built a language that we called Church, named after Alonzo Church. It was an example of bringing together higher-order logic languages based on what we call the lambda calculus, which was Church’s framework for universal computation. That’s really the underlying formal basis of computer programming languages like Lisp and Scheme.
We took that formalism for representing abstract knowledge and used that to generalize patterns of probabilistic and causal reasoning. That turned out to be very influential for both myself and others in terms of thinking about how to build systems that had a common-sense reasoning capacity—systems that really reasoned and didn’t just find patterns in data, and that could have abstractions that could generalize across many situations. We used these systems to capture, for example, people’s intuitive theory of mind—how we understand other people’s actions in terms of their beliefs and desires.
Using these tools of probabilistic programs over the last ten years, we were able to build for the first time reasonable, quantitative, predictive, and conceptually correct models of how humans, even young children, understand what other people are doing, and see people’s actions not just as movements in the world, but rather as the expressions of rational plans. We were also able to look at how people can work backward from seeing people move around in the world to figure out what they want and what they think, to infer their beliefs and desires. That’s an example of core common-sense reasoning that even young babies engage in. It’s part of how they really break into intelligence; they see other people doing something and they try to figure out why they are doing it and whether it’s a good guide for what they should do. To us, these were some of the first really compelling applications of these ideas of probabilistic programs.
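A deliberately toy version of that inverse-planning idea can be sketched in a few lines: watch an agent take a few steps, assume it acts noisily rationally toward some goal, and use Bayes’ rule to work backward to which goal it probably wants. The one-dimensional world, the two goals, and the noise parameter below are all hypothetical, chosen only to make the computation visible.

```python
# A toy, hypothetical version of the inverse-planning idea described above:
# watch an agent take a few steps, assume it acts noisily rationally toward
# some goal, and use Bayes' rule to work backward to which goal it wants.
import math

GOALS = {"left_goal": 0, "right_goal": 10}   # positions on a number line
BETA = 1.0                                   # how rationally the agent acts

def step_likelihood(position, step, goal_pos):
    """Softmax ("noisily rational") choice between stepping left and right:
    steps that reduce the distance to the goal are exponentially more likely."""
    utilities = {s: -abs((position + s) - goal_pos) for s in (-1, +1)}
    normalizer = sum(math.exp(BETA * u) for u in utilities.values())
    return math.exp(BETA * utilities[step]) / normalizer

def infer_goal(start, observed_steps):
    """Posterior over goals given the observed steps, with a uniform prior."""
    posterior = {}
    for goal, goal_pos in GOALS.items():
        likelihood, position = 1.0, start
        for step in observed_steps:
            likelihood *= step_likelihood(position, step, goal_pos)
            position += step
        posterior[goal] = likelihood
    total = sum(posterior.values())
    return {goal: p / total for goal, p in posterior.items()}

if __name__ == "__main__":
    # Starting at 5 and stepping right three times looks like wanting the right goal.
    print(infer_goal(start=5, observed_steps=[+1, +1, +1]))
```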
MARTIN FORD: Can these probabilistic methods be integrated with deep learning?
JOSH TENENBAUM: Yes, in the last couple of years, people have taken that same toolset and started to weave in neural networks. A key challenge for these probabilistic programs, as we were building them ten years ago and continue to build them today, is that inference is difficult. You can write down probabilistic programs that capture people’s mental models of the world, for example, their theory of mind or their intuitive physics, but actually getting these models to make inferences very quickly from the data you observe is a hard challenge algorithmically. People have been turning to neural networks and other kinds of pattern recognition technology as a way of speeding up inference in these systems. In the same way, you could think of how AlphaGo uses deep learning to speed up inference and search in a game tree. It’s still doing a search in the game tree, but it uses neural networks to make fast, intuitive guesses that guide its search.
Similarly, people are starting to use neural networks to find patterns in inference that can speed up inferences in these probabilistic programs. The machinery of neural networks and the machinery of probabilistic programs are increasingly coming to look a lot like each other. People are developing new kinds of AI programming languages that combine all these things, and you don’t have to decide which to use. They’re all part of a single language framework at this point.
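One simple way to picture that “fast guesses guide slower inference” pattern is importance sampling with a data-driven proposal: a cheap recognition function maps the observed data straight to a guess about the hidden quantity, and importance weights correct for the shortcut. In current systems the proposal would come from a trained neural network; in this sketch it is a hand-written stand-in, and every number is invented for illustration.

```python
# A sketch of the "fast guess guides slower inference" idea: importance
# sampling in which a cheap recognition function proposes likely values of a
# hidden quantity and importance weights correct for the shortcut. In current
# systems the proposal would come from a trained neural network; here it is a
# hand-written stand-in, and every number is invented for illustration.
import math
import random

def log_normal(x, mu, sigma):
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))

def log_prior(mean):
    return log_normal(mean, 0.0, 10.0)           # broad prior belief about the mean

def log_likelihood(mean, data):
    return sum(log_normal(x, mean, 1.0) for x in data)

def recognition_guess(data):
    """Stand-in for a learned recognition network: map the data directly to a
    proposal distribution over the unknown mean."""
    return sum(data) / len(data), 0.5            # proposal mean and spread

def posterior_mean(data, samples=5000):
    prop_mu, prop_sigma = recognition_guess(data)
    total_weight, weighted_sum = 0.0, 0.0
    for _ in range(samples):
        mean = random.gauss(prop_mu, prop_sigma)
        log_w = log_prior(mean) + log_likelihood(mean, data) - log_normal(mean, prop_mu, prop_sigma)
        weight = math.exp(log_w)
        total_weight += weight
        weighted_sum += weight * mean
    return weighted_sum / total_weight

if __name__ == "__main__":
    print(posterior_mean([2.1, 1.8, 2.4, 2.0]))  # lands near the data's average
```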
MARTIN FORD: When I talked to Geoff Hinton, I suggested a hybrid approach to him, but he was very dismissive of that idea. I get the sense that people in the deep learning camp are perhaps thinking not just in terms of an organism learning over a lifetime, but in terms of evolution. The human brain evolved over a very long time, and in some earlier form or organism it must have been much closer to being a blank slate. So perhaps that offers support for the idea that any necessary structure might naturally emerge?
JOSH TENENBAUM: There’s no question that human intelligence is very much the product of evolution, but by that we have to include both biological evolution and cultural evolution. A huge part of what we know, and how we know what we know, comes from culture. It’s the accumulation of knowledge across multiple generations of humans in groups. There’s no question that a baby who just grew up on a desert island with no other humans around would be a lot less intelligent. Well, they might be just as intelligent in some sense, but they would know a lot less than we know. They would also in a strict sense be less intelligent, because a lot of the ways in which we are intelligent—our systems of thinking, whether it’s mathematics, computer science, reasoning, or other systems of thought that we get through language—are the accumulation of many smart people over many generations.
It’s very clear when we look at our bodies that biological evolution has built incredibly complex structures with amazing functions. There’s no reason to think that the brain is any different. When we look at the brain, it’s less obvious what the complex structures are in the real neural networks that evolution has built, but the brain is not just a big blank-slate mess of randomly wired connections.
I don’t think any neuroscientist thinks that the brain is anything like a blank slate at this point. Real biological inspiration has to take seriously that at least in any one individual brain’s lifetime, there’s a huge amount of structure that’s built in, and that structure includes both our most basic models for understanding the world, and also the learning algorithms that grow our models beyond that starting point.
Part of what we get genetically, as well as culturally, are ways of learning that are much more powerful, much more flexible, and much faster than the kinds of learning that we have in deep learning today. These methods allow us to learn from very few examples and to learn new things much more quickly. Anyone who looks at and takes seriously the way real human babies’ brains start out and how children learn has to think about that.
MARTIN FORD: Do you think deep learning could succeed at achieving more general intelligence by modeling an evolutionary approach?
JOSH TENENBAUM: Well, a number of people at DeepMind and others who follow the deep reinforcement learning ethos would say they’re thinking about evolution in a more general sense, and that’s also a part of learning. They’d say their blank slate systems are not trying to capture what a baby does, but rather what evolution has done over many generations.
I think that’s a reasonable thing to say, but then my response would be to also look to biology for inspiration, which is to say: okay, fine, but look at how evolution actually works. It doesn’t work by having a fixed network structure and doing gradient descent in it, which is the way today’s deep learning algorithms work; rather, evolution actually builds complex structures, and that structure building is essential to its power.
Evolution does a lot of architecture search; it designs machines. It builds very differently structured machines across different species and over multiple generations. We can see this most obviously in bodies, but there’s no reason to think it’s any different in brains. The idea that evolution builds complex structures with complex functions, and that it does so not by gradient descent but by something more like search in the space of developmental programs, is very inspiring to me.
A lot of what we work on here is to think about how you view learning or evolution as something like search in a space of programs. The programs could be genetic programs, or they could be cognitive-level programs for thinking. The point is, it doesn’t look like gradient descent in a big fixed network architecture. You could say, we’re going to just do deep learning in neural networks, and say that’s trying to capture what evolution does, and not what human babies do, but I don’t think it’s really what human babies or evolution does.
It is, however, a toolkit that has been highly optimized, especially by the tech industry. People have shown you can do valuable things with big neural networks when you amplify them with GPUs and then with big distributed computing resources. All the advances that you see from DeepMind or Google AI, to name two, are essentially enabled by these resources and by a great program of integrated software and hardware engineering, building them out specifically to optimize for deep learning. The point I’m making is that when you have a technology that Silicon Valley has invested a large amount of resources in optimizing, it becomes very powerful. It makes sense for companies like Google to pursue that route to see where you can go with it. At the same time, I’m just saying that when you look at how it works in biology, either in the lifetime of an individual human or over evolution, it really looks rather different from that.
MARTIN FORD: What do you think of the idea of a machine being conscious? Is that something that logically comes coupled with intelligence, or do you think that’s something entirely separate?
JOSH TENENBAUM: That’s a hard thing to discuss because the notion of consciousness means many different things to different people. There are philosophers, as well as cognitive scientists and neuroscientists who study it in a very serious and in-depth way, and there’s no shared agreement on how to study it.
MARTIN FORD: Let me rephrase it, do you think that a machine could have some sort of inner experience? Is that possible or likely or even required for general intelligence?
JOSH TENENBAUM: The best way to answer that is to tease out two aspects of what we mean by consciousness. One is what people in philosophy have referred to as qualia, or the sense of subjective experience, which is very hard to capture in any kind of formal system. Think of the redness of red; we all know that red is one color and green is another color, and we also know that they feel different. We take for granted that when other people see red, they not only call it red, but they subjectively experience the same thing we do. We know it’s possible to build a machine that has those kinds of subjective experiences because we are machines and we have them. Whether we would have to do that, or whether we would be able to do that in the machines we’re trying to build right now, is very hard to say.
There’s another aspect of what we could call consciousness, which is what we might refer to as the sense of self. We experience the world in a certain kind of unitary way, and we experience ourselves being in it. It’s much easier to say that those are essential to human-like intelligence. What I mean by this is that when we experience the world, we don’t experience it in terms of tens of millions of cells firing.
One way to describe the state of your brain at any moment is at the level of what each neuron is doing, but that’s not how we subjectively experience the world. We experience the world as consisting of objects, and all of our senses come together into a unitary understanding of those things. That’s the way we experience the world, and we don’t know how to link that level of experience to neurons. I think if we’re going to build systems with human-level intelligence, then they’re going to have to have that kind of unitary experience of the world. It needs to be at the level of objects and agents, and not at the level of firings of neurons.
A key part of that is the sense of self—that I’m here, and that I’m not just my body. This is actually something that we’re actively working on in research right now. I’m working with the philosopher Laurie Paul and a former student and colleague of mine, Tomer Ullman, on a paper which is tentatively called Reverse Engineering the Self.
MARTIN FORD: Along the same lines as reverse engineering the mind?