More Than Good Intentions


by Dean Karlan


  No Way Forward?

  As we saw in chapter 1, economists Jeffrey Sachs and Bill Easterly have butted heads for years over a very simple but elusive question: Does aid really work? At the root of their differences is a disagreement over what constitutes “evidence,” and that’s the rub. Until recently, the debate about aid effectiveness has been tied up in complicated econometrics and a mire of controversial country-level data. The cutting-edge research that IPA has done in evaluating the effectiveness of specific development programs is finally giving us a new way to think about this question.

  The next step in resolving the aid debate isn’t more arguments from podiums and isn’t back-office analysis of huge country-level data sets. It’s far simpler, and more direct: Find individual programs that work, and support them. Find programs that don’t work, and stop doing them. And observe the patterns of both to learn which conditions are conducive to success, so that our first attempts at designing solutions get better and better.

  To do this, we need to get on the ground and work directly with development practitioners on evaluations. As early as the 1970s, economists were doing rigorous evaluations of specific social programs like job training and work-incentive taxation with the Department of Labor here in the United States. But for some reason—perhaps because we tend to be less demanding as donors than we are as taxpayers—the practice never took off in development. Until very recently, with virtually no hard evidence to guide us in choosing which tools to use in the fight against poverty, we were flying blind.

  Consider this analogy: For thousands of years, there was a general consensus in the medical community that the best way to treat hundreds of ailments, from acne to cancer to insanity, was by bloodletting. Sure, there were variations across doctors—some favored lances, others leeches—but they agreed on the basic principles: People were ill because of toxins in their blood, and the way to fix the problem was to bleed them out. Only in the mid-nineteenth century, with the advent of scientific medicine, did the practice begin to fall into disfavor. The reason? Somebody finally proved rigorously that it didn’t work.

  The sad fact is that much of the work being done around the world to fight poverty is in a sense like bloodletting. There is a wealth of conviction and some agreement about the driving principles—people are in need, and we should provide them with something to help—but that’s about the extent of it. The process of systematic testing, and the corresponding refinement of methods and treatments, is just beginning.

  The coming chapters will let you know what we’ve learned so far about what works, and what does not—and give you the basics of how we tell them apart. I’ll try not to bore you with technical details. (For the geeks out there—like me—who want such things, the endnotes have citations and comments on relevant research.) I can’t claim to answer all the will-this-work questions you might have (or even a majority of them), but I do hope to give you a jumping-off point, a way of thinking critically about impact that you can use wherever you engage with the issues of poverty—in the news, in conversation, or as a donor.

  Randomized Control Trials: Asking the Right Question

  So how exactly do we find out what works? The tool we use, called a Randomized Control Trial (RCT), is hardly cutting-edge. It’s actually about a thousand years old—much older than economics itself—and it has long been the gold standard throughout the sciences for determining effectiveness. To take one example, the Food and Drug Administration requires data from RCTs before it will approve new medicines. In general, if you need rigorous and systematic evidence of effectiveness on a large scale, you use an RCT when you can.

  The power of an RCT lies in its ability to give an objective, unbiased picture of the impact a program has on its participants. What do we mean by impact? Simply put, measuring impact means answering (at least) one simple question: How did people’s lives change with the program, compared to how they would have changed without it?

  Often, evaluations of development programs answer only the first half of the question: How did people’s lives change with the program? That is, they measure how people were before the program (the “before”) and compare that to how they were afterward (the “after”). These are aptly called “before-after” evaluations.

  Before-after analyses usually aren’t very good. In fact, they can be so bad that in many cases I suggest that, rather than do a before-after, an organization should pass on evaluation altogether and just provide more services. I consider it unethical to measure impact so badly that it really does not tell you anything. That just wastes money that could have gone to better uses.

  Here’s why the before-after approach is flawed. Suppose you are conducting a study in eastern Washington in the spring of 1980 to evaluate a new treatment for respiratory infections. On the morning of Sunday, May 18, BOOM! Mount St. Helens erupts. Soon, many of the subjects of the study (who also live in eastern Washington) develop severe respiratory infections, and your before-after comparison reveals that, indeed, many more subjects had infections at the end than at the beginning. What can you conclude about the treatment you tested? Did it really cause the subjects to develop more infections, or was that a consequence of something else—like the ash from the eruption?

  The before-after approach fails when something external (like a volcanic eruption) causes a change in the outcomes we care about (like respiratory infections). In the case of Mount St. Helens, it is quite easy to identify the outside influence. But with many development programs it is difficult, if not downright impossible, to observe them all. We need something extra that lets us account for those external factors—especially when they are hard to identify.

  That something extra is a set of people who don’t get the treatment being tested, but whom we monitor anyway (called a “control group”). Any external factors that come into play should affect both the treatment and control groups equally. If they do, then we can still compare the two at the end to see the impact of treatment. In the Mount St. Helens example, suppose the number of respiratory infections tripled in the control group, but only doubled in the treatment group—then we’d know that the new treatment really did help, despite the fact that there were more respiratory infections at the end than at the outset.
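The arithmetic behind that last comparison is worth making explicit. Here is a minimal sketch in Python, using made-up infection counts purely for illustration (the book gives no actual numbers):

```python
# Illustrative numbers only: suppose both groups start with 100 infections.
control_before, control_after = 100, 300      # tripled: no treatment, plus volcanic ash
treatment_before, treatment_after = 100, 200  # only doubled: treatment, plus volcanic ash

# Each group's raw "before-after" change looks bad on its own...
control_change = control_after - control_before        # +200
treatment_change = treatment_after - treatment_before  # +100

# ...but the difference between the two changes isolates the treatment effect,
# because the eruption's ash hit both groups equally.
treatment_effect = treatment_change - control_change   # -100: fewer infections than expected

print(treatment_effect)
```

The key move is in the last subtraction: the control group tells us what would have happened anyway, so the treatment gets credit only for the gap between the two changes, not for the raw before-after numbers.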

  Flip a Coin for Science

  But is any control group enough? Can we simply take a set of people who were not treated and compare their outcomes to those of the treated folks? Not quite. The two groups have to be similar enough that we can make a meaningful comparison between them.

  What exactly do we mean by similar? It is easy to find people who were not part of a program. Many evaluations of development programs do just that; and that’s where they go wrong. The very fact that certain people were excluded from the program often means they are not right to use for comparison! We have to ask why they were excluded. Did they choose not to join? Did they not qualify to participate? The answers to these questions can have big implications.

  Let’s say a microfinance bank wants to evaluate a new entrepreneurial loan by giving it to some clients as a pilot. At a large group meeting, the bank’s managers describe the loan and ask for twenty volunteers to form the pilot group. Then they choose twenty of the remaining clients (who didn’t volunteer) to monitor for control. Sure enough, the pilot is a success: Those receiving the new loans make more of their payments on time and in full. Based on these results, the bank’s management concludes that the features of the new loan cause better repayment behavior. It launches the product and offers it to all clients. Many take up the new loans, but they don’t fare so well—they actually default more than before. Did the test lead them astray?

  Not necessarily. The pilot showed the difference between twenty clients who volunteered to receive—and actually did receive—the new loan, and twenty clients who did neither. Maybe those who stood up to volunteer were excited by the offer because they had good business ideas and well-developed plans to execute them. And maybe those who didn’t volunteer (some of whom ended up in the control group) had fewer good business ideas or were less motivated. That would help to explain why the volunteers outperformed the comparison group, even if the new entrepreneurial loan had nothing to do with it.

  Because so many development programs—especially microcredit, but others as well—seek to leverage the intangible qualities of participants, this is a common problem. When you design an evaluation, how can you be sure you don’t put all the go-getters (or all the creative ones, or all the ambitious ones, or the ones with the strongest work ethic) in one group? If traits like these were easy to identify and measure, you could just deal them out evenly between treatment and control. But they aren’t easy to identify or measure—they are hidden.

  So how do you divide people evenly based on characteristics you can’t see?

  You flip a coin for each person to decide whether she is offered a program or not. If it comes up heads, assign to treatment. If it comes up tails, assign to control.

  That’s it. That’s the big secret. The coin does the work for us. Of course, it doesn’t have any idea who the go-getters are, but on average it will send half of them to each group. And as long as the total number of individuals is large enough, the treatment and control groups will have similar people on average across all characteristics. This is true for the things we observe, like gender, age, and education, as well as for the things we cannot observe and verify, like entrepreneurial spirit and ambition.

  The “on average” part is important. If you flip a coin a hundred times, it should come up heads close to 50 percent of the time. Flip that same coin a thousand times and the proportion should be even closer to 50 percent (though you still probably won’t get exactly five hundred heads). The point is, flipping a coin does not guarantee a perfect split, but it gets you close—and the more coin flips, the closer you get. So it is with randomization. On average, treatment and control groups constructed by random assignment will be comparable across all characteristics, and the larger the groups, the more confident we can be about the balance.
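The “more coin flips, the closer you get” claim is easy to check yourself. Here is a quick simulation sketch using only Python’s standard library (the sample sizes and seed are arbitrary choices for the illustration):

```python
import random

random.seed(1)  # fixed seed so the illustration is reproducible

def share_heads(n_flips):
    """Flip a fair coin n_flips times; return the fraction assigned to treatment."""
    heads = sum(random.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

# As the number of flips grows, the treatment share settles toward 50 percent,
# without ever being guaranteed to hit it exactly.
for n in (10, 100, 1000, 100000):
    print(n, round(share_heads(n), 3))
```

Running this shows exactly the pattern described above: small samples can split quite unevenly, while large ones hover very close to an even split — which is why bigger treatment and control groups give us more confidence that the hidden go-getters landed evenly on both sides.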

  So now you know: An RCT is not a complicated thing. This machine, powerful enough to find out the truth about what works in the fight against poverty, does not run on Ph.D.-level mathematics. It works by using randomization to split a set of people into two groups, taking a “before” snapshot of each, giving one of the groups the program in question, and comparing “after” snapshots of both groups.

  A Hard Question for Ernest

  Now, treatment and control groups and flipping of coins may not sound as sexy as some behavioral research, but that doesn’t mean doing an RCT of a development project is boring. Far from it. The structure of an RCT requires you to get your hands dirty, to encounter poverty firsthand. You want to collect solid and consistent data for the treatment and control group snapshots? Go out and get it. RCTs take place in the field—in slums, in teeming markets, in mud huts, in rice paddies—and they work by looking at real people as they make real decisions in the real world.

  Jake and I can both say from experience that doing field research is by turns inspiring, maddening, hilarious, tragic, joyous, and mysterious; but it is always enlightening. With almost equal frequency, seemingly intractable problems are resolved in an instant, and tasks that appeared straightforward are found to be impossibly complex. There simply is no such thing as a ho-hum day in the field.

  Here is an example from a project of mine on microcredit interest rates. Jake, who was a research assistant on the project at the time, conducted an interview with a phone card salesman in Ghana while pilot-testing a survey.

  Ernest was sitting in the shadow of a yellow umbrella. The dusty sidewalk shone, bleached by the bright white sunlight, and the frontier of the shadow cut a sharp edge against it. The umbrella was anchored by a small wooden cabinet painted bright yellow. On its top were a foolscap notebook, a ballpoint pen, and two mobile phones.

  Jake ducked his head under the rim of the umbrella and greeted him. “Good afternoon, sir.”

  “Yes, sir, good afternoon to you too.”

  “Sir, my name is Jake. Today I am doing a survey to learn about the businesses in this area and about their owners. Do you mind if I ask you a few questions about your phone card business?”

  “Oh, that will be fine, Jake. My name is Ernest.”

  Jake began with the first survey question, and soon came to the fifth question. “Ernest, how many people belong to your household? By that I mean: How many are you that share a single living space and take meals together?”

  Ernest didn’t waste any time. “Oh, that is just me, sir.”

  “I see. So you live alone?”

  “Oh, no, sir. I have a wife and three children. But myself, I wouldn’t eat with them. My wife brings my food to me alone.”

  “Ah. But normally your wife cooks for the whole family.”

  “Yes. She will prepare the stew and the fufu for all.”

  “So for how many people does your wife prepare food each evening?”

  “That is”—and Ernest counted silently on his fingers—“eight.”

  “Eight. So it is yourself, your wife, your three children, and three others. Who are the other three?”

  “Hm. They are my grandmother and my wife’s sister.” He cocked his head and waited.

  “Well, that sounds like two.”

  “Yes.”

  “So that makes seven altogether: you, your wife, your three children, your grandmother, and your wife’s sister.”

  “Yes, we are seven. And also the sister’s children. They are two.”

  “Oh, so seven and the two children—nine in all?”

  “Yes.”

  “And your wife’s sister, is she married?”

  “Yes, she has husband.”

  “And does he join you for meals most days?”

  “No, he stays with his family at the Central Region.”

  “I see. But what about his wife and two children you mentioned? Do they live at your house?”

  “No. They are with him.”

  “Oh. I thought you said they normally share meals with your family.”

  “Yes, we have been eating together.”

  “I’m afraid I don’t understand. Your wife’s sister and her two children—how can they live in the Central Region and also normally share meals with you?”

  “Oh, Jake! They have come to stay with us.” Ernest was smiling. Maybe he was thinking of his full house.

  “Are they just visiting, or do they live in the house with you?”

  “Oh, no, they don’t live there. They have only been staying for a very short time.”

  “Okay. So how long have they been with you?”

  “They came around the Christmas season.”

  It was July.

  What We Talk About When We Talk About Poverty

  Spend some time doing this kind of fieldwork—in sprawling, chaotic urban centers, in impossibly dense favelas climbing up steep hillsides, in tiny villages perched on the edge of a cliff, places that are accessible only by ancient rusty buses, or by gutted vans with bench seats made of bare wood planks, or by foot—and you pretty quickly stop talking about “fighting poverty” in lazy metaphors. Poverty is not a shackle that can be broken, not a tumor that can be excised, not a millstone that can be shattered, not a choking vine that can be clipped. Or at least seeing it in those ways accomplishes nothing.

  Here’s what the UN says about it: “Fundamentally, poverty is a denial of choices and opportunities, a violation of human dignity. It means lack of basic capacity to participate effectively in society.” This may be entirely true and accurate. But is it useful?

  When we articulate the problems of poverty in these terms, we’re bound to find solutions that traffic in the same currency. Witness the recent emphasis on “sustainable” programs—ones that, after an initial period of oversight and external funding, become self-sufficient and even self-propagating.

  The case for sustainability is often explained with a flourish of Chinese proverb: “Give a man a fish and you feed him for a day. Teach a man to fish and you feed him for life.”

  This gets donors and socially minded investors excited. People prefer to give a hand up rather than a handout. That makes sense: Instead of giving fish to the poor, let’s give them rods and reels and lessons about casting. Then we won’t have to provide fish in perpetuity. Outfitted with equipment and training, they will be able to eat long after we leave. What could possibly go wrong?

  The teach-a-man-to-fish approach has been around for decades. The results have not been as universally great as one might hope. For natural-born fishermen, it can work. But the problem is that some people are bad at baiting the hooks; some can’t cast worth a damn; some have arthritis and can’t grip the reel to haul in a catch; and some don’t live near a river with enough fish in it. Some people think fishing is just plain boring. Come dinner, all these folks are out of luck. They can’t eat rods and reels and lessons about casting. So what can this kind of development do for them?

  Up in the realm of high-minded concepts and metaphor—choices, opportunities, dignity, fishing—the air is thin and there are no actual poor people to be found. This isn’t where development needs to be. It needs to be on the ground. If we want to solve poverty, we need to know what it is in real—not abstract—terms. We need to know how it smells, tastes, and feels to the touch.

  And maybe this is why it’s such a hard thing to grasp: Poverty doesn’t have many positive sensory attributes, because to be poor means not to have things, in the most immediate sense. It means not having enough food, not having shelter, not having access to clean water or to essential medicines when you’re sick. The day-to-day experience of being poor is about lacking day-to-day necessities. It’s about not being able to get the things you need.

 
