IMPLICATIONS OF IVAN PAVLOV’S GREAT DISCOVERY
STEPHEN M. KOSSLYN
Psychologist; director, Center for Advanced Study in the Behavioral Sciences, Stanford University
ROBIN ROSENBERG
Clinical psychologist; author, What’s the Matter with Batman?
It’s easy to imagine a politician’s objecting to federal funds going to study how dogs drool. But failing to support such research would have been very shortsighted indeed. As part of his Nobel Prize–winning research on digestion, the great Russian physiologist Ivan Pavlov (1849–1936) measured the amount of saliva produced when dogs were given food. In the course of this work, he and his colleagues noticed something unexpected: The dogs began salivating well before they were fed. In fact, they salivated when they first heard the approaching footsteps of the person coming to feed them. That core observation led to the discovery of classical conditioning.
The key idea behind classical conditioning is that a neutral stimulus (such as the sound of approaching footsteps) comes to be associated with a stimulus (such as food) that reflexively produces a response (such as salivation)—and after a while, the neutral stimulus elicits the response produced reflexively by the paired stimulus. To be clear about the phenomenon, we’ll need to take a few words to explain the jargon. The neutral stimulus becomes “conditioned,” and hence is known as the conditioned stimulus (CS), whereas the stimulus that reflexively produces the response is known as the unconditioned stimulus (UCS). And the response produced by the UCS is called the unconditioned response (UCR). Classical conditioning occurs when the CS is presented right before a UCS, so that after a while the CS by itself produces the response. When this occurs, the response is called a conditioned response (CR). In short, at the outset a UCS (such as food) produces a UCR (such as salivation); when a CS (the sound of the feeder’s footsteps) is presented before the UCS, it soon comes to produce the response, a CR (salivation), by itself.
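The gradual strengthening of the CS–UCS association can be sketched in a few lines using the Rescorla–Wagner rule, a standard quantitative model of classical conditioning (the learning rate and trial count here are illustrative, not from the essay):

```python
# A minimal sketch of how repeated CS-UCS pairings build an association,
# using the Rescorla-Wagner update. ALPHA and the trial count are
# illustrative assumptions, not values from the text.

ALPHA = 0.3      # learning rate (salience of the CS)
LAMBDA = 1.0     # maximum associative strength the UCS supports

strength = 0.0   # associative strength of the CS (footsteps -> salivation)
history = []
for trial in range(10):
    # On each pairing, strength moves a fraction of the way toward LAMBDA,
    # so early pairings produce the largest gains in conditioning.
    strength += ALPHA * (LAMBDA - strength)
    history.append(strength)

# After a handful of pairings the CS predicts the UCS almost perfectly,
# at which point the CS alone elicits the conditioned response.
print([round(s, 2) for s in history])
```

The diminishing increments also capture why conditioning levels off: once the CS fully predicts the UCS, further pairings add little.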
This simple process gives rise to a host of elegant and nonintuitive explanations.
For example, consider accidental deaths from drug overdoses. In general, narcotics users tend to take the drug in a specific setting, such as their bathroom. The setting initially is a neutral stimulus, but after someone takes narcotics in it a few times, the bathroom comes to function as a CS: As soon as the user enters the bathroom with narcotics, the user’s body responds to the setting by preparing for the ingestion of the drug. Specific physiological reactions allow the body to cope with the drug, and those reactions become conditioned to the bathroom (in other words, the reactions become a CR). To get a sufficient high, the user must now take enough of the narcotic to overcome the body’s preparation. But if the user takes the drug in a different setting, perhaps in a friend’s bedroom during a party, the CR does not occur—that is, the usual physiological preparation for the narcotic does not take place. Thus, the usual amount of the drug functions as if it were a larger dose and may be more than the user can tolerate without the body’s preemptive readiness. Hence, although the process of classical conditioning was formulated to explain very different phenomena, it can be extended to explain why drug overdoses sometimes accidentally occur when usual doses are taken in new settings.
By the same token, classical conditioning plays a role in the placebo effect: The analgesics regularly used by many of us, such as ibuprofen or aspirin, begin to take effect well before their active ingredients have time to kick in. Why? From previous experience, the mere act of taking that particular pill has become a CS, which triggers the pain-relieving processes invoked by the medicine itself (and those processes have become a CR).
Classical conditioning also can result from an implanted cardiac defibrillator. When the heart beats too quickly, this device shocks it, causing it to revert to beating at a normal rate. Until the shock level is properly calibrated, the shock can be very uncomfortable and function as a UCS, producing fear as a UCR. Because the shock does not occur in a consistent environment, the person associates random aspects of the environment with it—which then function as CSs. And when any of those environmental aspects are present, the person can experience severe anxiety, awaiting the possible shock.
This same process explains why you find a particular food unappealing once it’s given you food poisoning. The illness acts as a UCS, and the food comes to function as a CS: If you eat it—or even think about eating it—you may feel queasy, a CR. You may find yourself avoiding that food, and thus a food aversion is born. In fact, simply pairing pictures of a particular type of food (such as French fries) with aversive photographs (such as of a horribly burned body) can change how appealing you find that food.
Thus Pavlov’s discovery of anticipatory salivation can be easily extended to a wide range of phenomena. But that said, we should point out that his original conception of classical conditioning was not quite right. He thought that sensory input was directly connected to specific responses, leading the stimuli to produce the response automatically. We now know that the connection is not so direct; classical conditioning involves many cognitive processes, such as attention and those underlying interpretation and understanding. In fact, classical conditioning is a form of implicit learning. As such, it allows us to navigate through life with less cognitive effort (and stress) than would otherwise be required. Nevertheless, this sort of conditioning has by-products that can be powerful, surprising, and even sometimes dangerous.
NATURE IS CLEVERER THAN WE ARE
TERRENCE J. SEJNOWSKI
Computational neuroscientist; Francis Crick Professor, the Salk Institute; coauthor (with Patricia S. Churchland), The Computational Brain
We have the clear impression that our deliberative mind makes the most important decisions in our life: what work we do, where we live, whom we marry. But contrary to this belief, the biological evidence points toward a decision process in an ancient brain system called the basal ganglia, brain circuits that consciousness cannot access. Nonetheless, the mind dutifully makes up plausible explanations for the decisions.
The scientific trail that led to this conclusion began with honeybees. Worker bees forage the spring fields for nectar, which they identify with the color, fragrance, and shape of a flower. The learning circuit in the bee brain converges on VUMmx1, a single neuron that receives the sensory input and, a bit later, the value of the nectar, and learns to predict the nectar value of that flower the next time the bee encounters it. The delay is important, because the key is prediction, rather than a simple association. This is also the central core of temporal-difference (TD) learning, which entails learning a sequence of decisions leading to a goal and is particularly effective in uncertain environments, like the world we live in.
Buried deep in your midbrain, there’s a small collection of neurons—found in our earliest vertebrate ancestors, and projecting throughout the cortical mantle and basal ganglia—that are important for decision making. These neurons release a neurotransmitter called dopamine, which has a powerful influence on our behavior. Dopamine has been called a “reward molecule,” but more important than reward itself is the ability of these neurons to predict reward: If I had that job, how happy would I be? Dopamine neurons, which are central to motivation, implement TD learning, just as VUMmx1 does.
TD learning solves the problem of finding the shortest path to a goal. It’s an online algorithm, because it learns by exploring and discovers the value of intermediate decisions in reaching the goal. It does this by creating an internal value function, which can be used to predict the consequences of actions. Dopamine neurons evaluate the current state of the entire cortex and inform the brain about the best course of action from the current state. In many cases, the best course of action is a guess, but because guesses can be improved, TD learning creates, over time, a value function of oracular powers. Dopamine may be the source of the “gut feeling” you sometimes experience, the stuff that intuition is made from.
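The value-function idea can be made concrete with a toy version of the shortest-path task: a line of states with a reward only at the goal, learned by the standard TD(0) update (the state layout, learning rate, and discount factor are illustrative choices, not details from the essay):

```python
# A minimal sketch of TD(0) learning on a toy shortest-path task:
# states 0..4 in a line, with reward only at the goal state 4.
# All parameters here are illustrative assumptions.
import random

N_STATES = 5          # states 0..4; state 4 is the goal
GOAL = N_STATES - 1
ALPHA = 0.1           # learning rate
GAMMA = 0.9           # discount factor

# The internal "value function": predicted future reward from each state.
V = [0.0] * N_STATES

def step(state, action):
    """Move left (-1) or right (+1); reward 1.0 only on reaching the goal."""
    nxt = max(0, min(GOAL, state + action))
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward

random.seed(0)
for episode in range(2000):
    s = 0
    while s != GOAL:
        a = random.choice([-1, 1])        # explore by random moves
        s2, r = step(s, a)
        # TD update: nudge V[s] toward reward + discounted prediction at s2.
        # The prediction error (the bracketed term) is what dopamine
        # neurons are thought to signal.
        V[s] += ALPHA * (r + GAMMA * V[s2] - V[s])
        s = s2

print([round(v, 2) for v in V])
```

After learning, the values rise from the start state toward the goal, so an agent that greedily climbs the value function takes the shortest path, even though no single experience ever revealed that path directly.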
When you’re considering various options, prospective brain circuits evaluate each scenario, and the transient level of dopamine registers the predicted value of each decision. The level of dopamine is also related to your level of motivation, so not only will a high level of dopamine indicate a high expected reward, but you will also have a higher level of motivation to pursue it. This is quite literally the case in the motor system, where a higher tonic dopamine level produces faster movements. The addictive power of cocaine and amphetamines is a consequence of increased dopamine activity, hijacking the brain’s internal motivation system. Reduced levels of dopamine lead to anhedonia, an inability to experience pleasure; and the loss of dopamine neurons results in Parkinson’s disease, an inability to initiate actions and thoughts.
TD learning is powerful because it combines information about value along many different dimensions, in effect comparing apples and oranges in achieving distant goals. This is important because rational decision making is very difficult when there are many variables and unknowns. Having an internal system that quickly delivers good guesses is a great advantage, and may make the difference between life and death when a quick decision is needed. TD learning depends on the sum of your life experiences. It extracts what is essential from these experiences long after the details of the individual experiences are no longer remembered.
TD learning also explains many of the experiments performed by psychologists who trained rats and pigeons in simple tasks. Reinforcement learning algorithms have traditionally been considered too weak to explain complex behaviors, because the feedback from the environment is minimal. Nonetheless, reinforcement learning is universal among nearly all species and is responsible for some of the most complex forms of sensorimotor coordination, such as piano playing and speech. Reinforcement learning has been honed by hundreds of millions of years of evolution. It has served countless species well, particularly our own.
How complex a problem can TD learning solve? TD-Gammon is a computer program that learned how to play backgammon by playing itself. The difficulty with this approach is that the reward comes only at the end of the game, so it’s not clear which were the good moves that led to the win. TD-Gammon started out with no knowledge of the game, except for the rules. By playing itself many times and applying TD learning to create a value function to evaluate game positions, TD-Gammon climbed from beginner to expert level, along the way picking up subtle strategies similar to ones that humans use. After playing itself a million times, it reached championship level and was discovering new positional play that astonished human experts. Similar approaches to the game of Go have achieved impressive levels of performance and are on track to reach professional levels.
When there’s a combinatorial explosion of possible outcomes, selective pruning is helpful. Attention and working memory allow us to focus on the most important parts of a problem. Reinforcement learning is also supercharged by our declarative memory system, which tracks unique objects and events. When large brains evolved in primates, the increased memory capacity greatly enhanced their ability to make complex decisions, leading to longer sequences of actions to achieve goals. We are the only species to create an educational system and to consign ourselves to years of instruction and tests. Delayed gratification can extend into the distant future (in some cases, into an imagined afterlife), a tribute to the power of dopamine to control behavior.
At the beginning of the cognitive revolution in the 1960s, the brightest minds could not imagine that reinforcement learning could underlie intelligent behavior. Minds are not reliable. Nature is cleverer than we are.
IMPOSING RANDOMNESS
MICHAEL I. NORTON
Associate professor of business administration and Marvin Bower Fellow, Harvard Business School
Paul Meier, who passed away in 2011, was primarily known for his introduction of the Kaplan-Meier estimator. But Meier was also a seminal figure in the widespread adoption of an invaluable explanatory tool: the randomized experiment. The decided unsexiness of the term masks a truly elegant form, which in the hands of its best practitioners approaches art. Simply put, experiments offer a unique and powerful means for devising answers to the question that scientists across disciplines seek to answer: How do we know whether something works?
Take a question that appears anew in the media each year: Is red wine good or bad for us? We learn a great deal about how red wine works by asking people about their consumption and health and looking for correlations between the two. To estimate the specific impact of red wine on health, though, we need to ask people a lot of questions—about everything they consume (food, prescription medication, more unsavory forms of medication), their habits (exercise, sleep, sexual activity), their past (their health history, their parents’ and grandparents’ health histories), and on and on—and then try to control for these factors to isolate the impact of wine on health. Think of the length of the survey.
Randomized experiments completely reengineer how we go about understanding how red wine works. We take it as a given that people vary in the manifold ways described above (and others), but we cope with this variance by randomly assigning people to either drink red wine or not. If people who eat doughnuts and never exercise are equally likely to be in the “wine treatment” or the “control treatment,” then we can do a decent job of assessing the average impact of red wine over and above the likely impact of other factors. It sounds simple because, well, it is—but anytime a simple technique yields so much, “elegant” is a more apt description.
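The logic of random assignment can be seen in a small simulation. The numbers below (a doughnut-eating confounder, a +2 benefit from wine) are entirely made up for illustration; the point is that a coin flip spreads the confounder evenly across the two groups, so a simple difference of means recovers the true effect:

```python
# A minimal sketch of why random assignment works. The outcome model,
# effect sizes, and confounder rates are invented for illustration.
import random

random.seed(42)

def health_score(drinks_wine, eats_doughnuts):
    """Toy outcome: wine adds +2, doughnuts subtract 5, plus noise."""
    base = 70.0
    base += 2.0 if drinks_wine else 0.0
    base -= 5.0 if eats_doughnuts else 0.0
    return base + random.gauss(0, 3)

# A population in which 40% are doughnut eaters (the hidden confounder).
people = [{"doughnuts": random.random() < 0.4} for _ in range(10000)]

# Random assignment: each person flips a fair coin for treatment.
for p in people:
    p["wine"] = random.random() < 0.5
    p["health"] = health_score(p["wine"], p["doughnuts"])

treated = [p["health"] for p in people if p["wine"]]
control = [p["health"] for p in people if not p["wine"]]

def mean(xs):
    return sum(xs) / len(xs)

# Because doughnut eaters land in both groups at (roughly) equal rates,
# the confounder cancels, and the difference of means isolates the wine.
effect = mean(treated) - mean(control)
print(round(effect, 2))  # close to the true +2 effect
```

No survey about doughnuts, exercise, or grandparents was needed; randomization handled every confounder at once, including ones we never thought to measure.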
The use of experiments in the social sciences, which began to take hold in the 1950s—including Meier’s contributions—has exploded in recent years with the adoption of randomized experiments in fields ranging from medicine (testing interventions, like cognitive behavioral therapy) to political science (running voter-turnout experiments) to education (assigning kids to be paid for grades) to economics (encouraging savings behavior). The experimental method has also begun to filter into and impact public policy: President Obama appointed behavioral economist Cass Sunstein to head the Office of Information and Regulatory Affairs, and Prime Minister David Cameron instituted a Behavioural Insights Team.
Randomized experiments are by no means a perfect tool for explanation. Some important questions simply do not lend themselves to randomized experiments, and the method in the wrong hands can cause harm, as in the infamous Tuskegee syphilis experiment. But their increasingly widespread application speaks to their flexibility in informing us how things work and why they work that way.
THE UNIFICATION OF ELECTRICITY AND MAGNETISM
LAWRENCE M. KRAUSS
Physicist/cosmologist, Arizona State University; author, A Universe from Nothing
No explanation I know of in recent scientific history is as beautiful or deep, or ultimately as elegant, as the 19th-century explanation of the remarkable connection between two familiar but seemingly distinct forces in nature—electricity and magnetism. It represents to me all that is best about science: It combined surprising empirical discoveries with a convoluted path to a remarkably simple and elegant mathematical framework, which explained far more than was ever bargained for and in the process produced the technology that powers modern civilization.
Strange experiments with jumping frogs and electric circuits eventually led to the serendipitous discovery, by the self-schooled Michael Faraday, the greatest experimentalist of his time, of a strange connection between magnets and electric currents. By then, it was well known that a moving electric charge (or current) created a magnetic field around itself that could repel or attract other nearby magnets. What remained an open question was whether magnets could produce any electric force on charged objects. Faraday discovered, by accident, that when he turned a switch on or off to start or stop a current, creating a magnetic field that grew or shrank with time, a force would arise in a nearby wire during the periods when the field was changing, moving the electric charges within it to create a current.
Faraday’s law of induction, as it became known, not only is responsible for
the basic principle governing all electric generators from Niagara Falls to nuclear power plants but also produced a theoretical conundrum that required the mind of the greatest theoretical physicist of his time, James Clerk Maxwell, to set things straight. Maxwell realized that Faraday’s result implied that it was the changing magnetic field (a pictorial concept introduced by Faraday himself because he felt more comfortable with pictures than algebra) that produced an electric field that pushed the charges around the wire, thereby creating a current.
Achieving mathematical symmetry in the equations governing electric and magnetic fields then required that a changing electric field and not merely moving charges would produce a magnetic field. This not only produced a set of mathematically consistent equations every physics student knows (and some love) called Maxwell’s equations, which can fit on a T-shirt, but it established the physical reality of what was otherwise a figment of Faraday’s imagination: a field—that is, some quantity associated with every point in space and time.
Moreover, Maxwell realized that if a changing electric field produced a magnetic field, then a constantly changing electric field, such as occurs when you continuously jiggle a charge up and down, would produce a constantly changing magnetic field. That, in turn, would create a constantly changing electric field, which would create a constantly changing magnetic field, and so on. This field “disturbance” would move out from the original source (the jiggling charge) at a rate that Maxwell could calculate on the basis of his equations. The parameters in these equations came from experiment—from measuring the strength of the electric force between two known charges and the strength of the magnetic force between two known currents.
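Maxwell's calculation can be stated compactly. In empty space (no charges or currents), the two curl equations in his set read, in modern SI notation:

```latex
\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t},
\qquad
\nabla \times \mathbf{B} = \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t}.
```

Taking the curl of the first equation and substituting the second yields a wave equation,

```latex
\nabla^2 \mathbf{E} = \mu_0 \varepsilon_0 \frac{\partial^2 \mathbf{E}}{\partial t^2},
```

whose disturbances travel at speed $c = 1/\sqrt{\mu_0 \varepsilon_0}$. This is exactly how the two experimentally measured constants, the strength of the electric force between charges ($\varepsilon_0$) and of the magnetic force between currents ($\mu_0$), fix the speed at which the field disturbance spreads.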
This Explains Everything Page 28