Reign of Error: The Hoax of the Privatization Movement and the Danger to America's Public Schools
Other economists said it might take four great teachers in a row, or even five great teachers in a row, to close the gaps, but the reformers usually preferred to stay with the claim of three years. (Of course, if all children have a great teacher, and all children are making the same gains, the achievement gap won’t close, but that’s another issue.) The reformers often repeat the claim that three “great” or “effective” teachers in a row would close the test score gap between black and white children, between rich and poor children, between Hispanic and white children. Michelle Rhee cited this supposed finding many times; she said, for example, at the University of Southern California in 2011: “We know for poor minority children, if they have three highly effective teachers in a row, versus three ineffective teachers in a row, it can literally change their life trajectory.”3
Arne Duncan often said something similar: “Three great teachers in a row, and the average child will be a year and a half to two grade levels ahead. Three bad teachers in a row, and that average child might be so far behind they might never catch up.”4
A variation on the same theme is that a great teacher produces three times as much learning in a year as a poor teacher. Or, put another way, the students of the great teacher get a test score gain of eighteen months in a year, while the students of the poor teacher learn only six months’ worth of whatever they studied in a year. The Stanford economist Eric Hanushek wrote in 2010 that the difference in effectiveness among teachers “is truly large, with some teachers producing 1½ years of gain in achievement in an academic year while others with equivalent students produce only ½ year of gain. In other words, two students starting at the same level of achievement can know vastly different amounts at the end of a single academic year due solely to the teacher to which they are assigned. If a bad year is compounded by other bad years, it may not be possible for the student to recover.”5 Perhaps such “great” teachers exist, but there is no evidence that they exist in great numbers or that they can produce the same feats year after year for every student.6
In the fall of 2010, the documentary film Waiting for “Superman” popularized the idea in the national media that American public education was in desperate condition because there were so many bad teachers in the schools. At the same time, a group of urban superintendents led by Joel Klein and Michelle Rhee published a “manifesto” about how to fix the schools, which asserted: “So, where do we start? With the basics. As President Obama has emphasized, the single most important factor determining whether students succeed in school is not the color of their skin or their ZIP code or even their parents’ income—it is the quality of their teacher.”7 Their manifesto asserted that teachers’ credentials, experience, and education were irrelevant in judging their quality. The only thing that matters, they argued, is “performance,” meaning the test scores of their students.
Klein and Rhee misquoted President Obama. President Obama had said that “the biggest ingredient in school performance is the teacher. That’s the biggest ingredient within a school. But the single biggest ingredient is the parent.”8 Richard Rothstein described the Klein-Rhee manifesto as a “caricature” and added:
Decades of social science research have demonstrated that differences in the quality of schools can explain about one-third of the variation in student achievement. But the other two-thirds is attributable to non-school factors … What President Obama means is that if a child’s parents are poorly educated themselves and don’t read frequently to their young children, or don’t use complex language in speaking to their children, or are under such great economic stress that they can’t provide a stable and secure home environment or proper preventive health care to their children, or are in poor health themselves and can’t properly nurture their children, or are unable to travel with their children or take them to museums and zoos and expose them to other cultural experiences that stimulate the motivation to learn, or indeed live in a zip code where there are no educated adult role models and where other adults can’t share in the supervision of neighborhood youth, then children of such parents will be impeded in their ability to take advantage of teaching, no matter how high quality that teaching may be.9
Social scientists generally agree that students’ families (especially family income, which determines advantages and opportunity) have an even bigger impact on student performance than their school or teachers. According to some economists, family accounts for about 60 percent of the variation in test scores; the school (its leadership, its staff, its resources, its programs, and such matters as the presence or absence of peer effects, that is, the presence or absence of willing students) is responsible for about 20–25 percent of the variation. Within the 20–25 percent attributable to the school, teachers are the biggest component affecting how students perform on tests, possibly as much as 15 percent. President Obama accurately said that the teacher matters most within the school, but “the biggest ingredient” in students’ academic performance is their family.10 (Personally, I am skeptical about these precise statistical calculations about large and complex human activities, but I am not an economist, so what do I know?)
Yet the myth persists that the teacher is primarily responsible for student scores and that great teachers can overcome the influence of family, poverty, disability status, language proficiency, and students’ own levels of interest and ability. Certainly, there are many people whose lives were changed by one teacher, but their stories typically describe teachers who were unusually inspiring, not “the teacher who raised my test scores to the top.” Teachers do have the power to change lives. But after more than a decade of No Child Left Behind, researchers are still searching for a nonselective school or a district where every student, regardless of his or her starting point, has achieved proficiency on state tests because that school or that district has only effective teachers.
Despite the absence of evidence, the claims persist. On its Web site, Michelle Rhee’s organization StudentsFirst says, “Research shows that a highly effective teacher generates 50% more learning than an average teacher. Conversely, an ineffective teacher generates 50% less learning than an average teacher. This means that kids learn three times more in a highly effective teacher’s classroom than in an ineffective teacher’s classroom.”11 Presumably, if a school hired and retained only those highly effective teachers, there would be dramatic gains in student test scores for all students. But Rhee doesn’t seem to understand that very few teachers get the same high test score gains year after year. In 2012, Melinda Gates said in a television interview, “An effective teacher in front of a student, that student will make three times the gains in a school year that another student will make.”12 She said that the job of the Gates Foundation is “to make sure we create a system where we can have an effective teacher in every single classroom across the United States.”
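The "three times" figure in these statements is simple arithmetic on the quoted percentages. A minimal sketch, assuming purely for illustration that an average teacher's yearly learning is scaled to 1.0 (the variable names are hypothetical and come from no cited study):

```python
# Illustrative assumption: an average teacher produces 1.0 "year" of learning.
average_learning = 1.0

# StudentsFirst's claim: 50% more and 50% less learning than average.
highly_effective = average_learning * 1.5
ineffective = average_learning * 0.5

# Ratio between the two extremes.
ratio = highly_effective / ineffective
print(ratio)
```

The same ratio underlies Hanushek's comparison of 1½ years of gain with ½ year. The arithmetic restates the claim; it does not show that any teacher sustains such gains year after year.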
Given the reformers’ conviction that the teacher is the key to raising test scores dramatically for every student, they had to find a strategy to identify those highly effective teachers and get rid of those who didn’t have the right stuff.
In his academic studies and in Waiting for “Superman,” Eric Hanushek proposed that public schools should fire 5–10 percent of the teachers whose students got the lowest scores. If that happened, he said, the United States would rise nearly to the top of international test rankings. Moreover, he argued, replacing those bottom-of-the-barrel teachers with average teachers would add trillions of dollars to the nation’s gross national product. He wrote:
U.S. achievement could reach that in Canada and Finland if we replaced with average teachers the least effective 5 to 7 percent of teachers, respectively. Assuming the lower-bound estimate of teachers’ impact, U.S. achievement could reach that in Canada and Finland if we replaced with average teachers the least effective 8 to 12 percent of teachers, respectively …
Closing the achievement gap with Finland would, according to historical experience, have astounding benefits, increasing the annual growth rate of the United States by 1 percent of GDP. Accumulated over the lifetime of somebody born today, this improvement in achievement would amount to nothing less than an increase in total U.S. economic output of $112 trillion in present value. (That was not a typo—$112 trillion, not billion.)13
Hanushek suggested that there were three ways to get this dramatic improvement in teacher quality. One was to recruit higher-caliber teachers; another was to improve the skills of current teachers. But he maintained that both these methods had been tried and found inadequate. Instead, he recommended “deselection” of the bottom teachers based on their performance, defined as the test scores of their students. But school districts and states would need to change their policies, he believed, to attract and retain the kinds of teachers who could produce amazing test scores:
They would need recruitment, pay, and retention policies that allow for the identification and compensation of teachers on the basis of their effectiveness with students. At a minimum, the current dysfunctional teacher-evaluation systems would need to be overhauled so that effectiveness in the classroom is clearly identified. This is not an impossible task. The teachers who are excellent would have to be paid much more, both to compensate for the new riskiness of the profession and to increase the chances of retaining these individuals in teaching. Those who are ineffective would have to be identified and replaced. Both steps would be politically challenging in a heavily unionized environment such as the one in place today.
Although Hanushek is associated with the Hoover Institution at Stanford, his views were embraced by the Obama administration’s Race to the Top program and lauded by Republican governors across the nation, such as Scott Walker in Wisconsin, John Kasich in Ohio, Mitch Daniels in Indiana, Jeb Bush and his successor, Rick Scott, in Florida, and Chris Christie in New Jersey. Even Democratic governors like Dannel Malloy in Connecticut and Andrew Cuomo in New York endorsed the belief that low test scores were caused by “bad” or “ineffective” teachers, not by poverty and not by the relationship between resources and student needs.
Hanushek’s theory that test scores will improve by “deselecting” teachers whose students receive low test scores got a huge boost in 2012 with the highly publicized release of a study by the economists Raj Chetty and John N. Friedman of Harvard University and Jonah E. Rockoff of Columbia University. The Chetty study reviewed the records of students and teachers in the 1990s, before the advent of high-stakes testing, and concluded that students who had an effective teacher for a single year would have higher lifetime earnings and other benefits. The study was announced on the front page of The New York Times, where one of the authors said, “The message is to fire people sooner rather than later.” The study said that replacing a poor teacher with an average teacher would raise a single classroom’s lifetime earnings by $266,000.14 President Obama was so impressed by the Chetty study that he referred to it a few weeks later in his State of the Union address, saying, “We know a good teacher can increase the lifetime income of a classroom by over $250,000.”
However, critics were quick to raise questions about the study. They said that the authors may have confused correlation with causation (a class that gets higher test scores is also likelier to go to college and earn more) and that a large-scale study cannot pinpoint the effects of individual teachers. More than one critic pointed out that a lifetime gain of $266,000 for a class of twenty-six children, engaged in the labor force for forty years, translated to about $250 a year, or $5 a week. It would be even less for a larger class. As Bruce Baker observed, “What this boils down to is that a student can get a lifetime boost of $5 a week if we now spend billions of dollars on value-added rating systems. Maybe. Or maybe not.”15
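The critics' arithmetic can be checked directly. A minimal sketch, using the figures given above (a $266,000 classroom gain, twenty-six children, forty working years) and assuming a 52-week year; the function name is hypothetical:

```python
def per_student_gain(classroom_gain, class_size, working_years, weeks_per_year=52):
    """Split a classroom's lifetime earnings gain into per-student yearly and weekly amounts."""
    per_student = classroom_gain / class_size   # lifetime gain for one student
    per_year = per_student / working_years      # gain per working year
    per_week = per_year / weeks_per_year        # gain per week (52-week year assumed)
    return per_year, per_week


per_year, per_week = per_student_gain(266_000, 26, 40)
print(round(per_year, 2), round(per_week, 2))  # 255.77 4.92
```

This matches the critics' rounded figures of about $250 a year and $5 a week. Raising `class_size` lowers both amounts, which is the point about larger classes.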
None of the enthusiasts of value-added assessment recognized that nations at the top of the international league tables did not get there by “deselecting” teachers whose students got low test scores. Nations such as Finland, Canada, Japan, and South Korea spend time and resources improving the skills of their teachers, not selectively firing them in relation to student test scores.
Nonetheless, what entered the reform lexicon was a fixed belief that bad teachers must be found out and fired.
But then came the knotty problem: How can a school district measure teacher quality? How can district leaders know which teachers should get bonuses and which should be fired? The only way to answer these questions, reformers believe, is to collect test scores every year and then see which teachers got those big gains and which ones didn’t. Then rank the teachers from top to bottom. Once the ranking is done, according to reform theory, the teachers whose students got the big gains get bonuses and the ones whose students got no gains get fired. Eventually, if this is done consistently, the district ends up with only great teachers.
Some districts and states have already collected enough data to rank teachers by the test score gains of their students. Whether the rankings are accurate or not, some teachers have gotten bonuses and some have been fired. But no district has yet demonstrated the reformers’ thesis that firing teachers based on student test scores will bring about great increases for the district. Despite the oft-repeated claims by reformers that three years in a row of great teachers will close the gap, no school district has ever done it, not even districts with a superintendent and school board fully supportive of the corporate reform faith and without a teachers’ union to stand in the way. It remains a theory based on speculation, not evidence.
One reason it is hard to prove the theory is that the ratings are unstable from year to year. A teacher may be rated effective one year but ineffective the next. And the fact that the top-rated teachers produce gains large enough to close the achievement gap in three to five years doesn’t necessarily matter much if you cannot identify the teachers who have this impact year after year. Only a small proportion of teachers gets big test score gains year after year, so it may be difficult to find enough of them to staff an entire school, let alone an entire school district. As Matthew Di Carlo of the Shanker Institute has pointed out, “Because of the imprecision of these growth models, various sources of bias, and year-to-year variation in students and conditions, very few teachers manage to be ‘top’ teachers for three, four or five consecutive years. A huge chunk of the ‘top’ teachers in year one are average—or even below average—in year two. Even more of them fall out of the ‘top’ bracket in the third, fourth, and fifth years.”16
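The instability Di Carlo describes can be illustrated with a toy simulation. This is a sketch under loudly stated assumptions, not a reproduction of any growth model. Each hypothetical teacher has a fixed "true" effect, each year's rating adds random noise of a chosen size, and the "top" bracket is the highest-rated 20 percent. Every number here is illustrative rather than empirical.

```python
import random


def top_persistence(n_teachers=1000, n_years=5, noise_sd=1.0, top_share=0.2, seed=0):
    """Share of year-one 'top' teachers still rated 'top' in every year so far.

    Assumptions (illustrative only): true effects ~ Normal(0, 1); each yearly
    rating = true effect + Normal(0, noise_sd) noise.
    """
    rng = random.Random(seed)
    true_effect = [rng.gauss(0.0, 1.0) for _ in range(n_teachers)]
    k = int(n_teachers * top_share)  # size of the 'top' bracket
    still_top = None
    shares = []
    for _ in range(n_years):
        rating = [t + rng.gauss(0.0, noise_sd) for t in true_effect]
        ranked = sorted(range(n_teachers), key=lambda i: rating[i], reverse=True)
        top = set(ranked[:k])
        # Keep only teachers who have been 'top' in every year up to now.
        still_top = top if still_top is None else still_top & top
        shares.append(len(still_top) / k)
    return shares


for year, share in enumerate(top_persistence(), start=1):
    print(f"top in all of years 1-{year}: {share:.0%}")
```

With `noise_sd=0.0` every year-one top teacher stays on top. As the noise grows relative to the true differences, the persistent group shrinks. How large the noise is in real rating systems is an empirical question this sketch does not answer.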
Another reason it is hard to prove the theory is that teachers are not factory workers who can be shifted from spot to spot as if they were on an assembly line. The teacher who is highly effective in one school may not be equally effective in another. But we can’t know for sure, because no one has tried to move teachers around to prove the theory that three great teachers in a row will close the achievement gap for an entire school or district. Not yet, anyway.
While it seems certain that some teachers are excellent and others are not, the theory is based on some wobbly claims. The very concept of value-added assessment reflects the mind-set of statisticians and economists who measure productivity gains. A farmer plants corn of a certain variety in a certain type of soil, treats it with certain conditions, and then measures the growth of the crop to determine the worthiness of the treatment. In the context of value-added assessment, the teacher is the treatment. If the teacher is effective, the corn grows to a certain height. If the teacher is not, the corn does not grow or grows very little.
But children are not corn. They are not seeds or plants with fixed characteristics. Children’s lives are not static. They have crises and ups and downs in their home lives and their personal lives. Maybe their parents got divorced. Maybe a parent lost her job. Maybe a student broke up with her boyfriend or totaled the family car. Maybe a family member died. Maybe the family moved to a new home. Maybe they were evicted from their home. These changes affect motivation, attention, and school performance. Children are not crops. They are not empty vessels waiting to be filled by a teacher.
In addition, the conditions for the teacher do not remain static. There may be more or fewer high-scoring students assigned to the teacher’s class. Class size may increase because of state budget cuts. The curriculum and instructional materials may be better or worse this year. The school leader may change and be more or less supportive. Valued colleagues may retire. The school climate may be tranquil or disruptive. Any number of changes in the school may affect the teacher’s classroom, the availability of resources and support, and ultimately the test scores of students.
The problems with value-added assessment are legion. Students are not randomly assigned, so teachers face different challenges every year. An excellent teacher may have a highly motivated group of students one year, while an equally effective teacher may be assigned a class with two or three troublemakers, who disrupt the class. Some teachers are deliberately assigned high-performing or low-performing students, or choose to teach one group or the other. One teacher gets great results, the other does not, but they faced different challenges, and the comparison is unfair.
The American Educational Research Association (AERA) and the National Academy of Education (NAE) prepared a joint statement about the problems with value-added assessment. They found that students’ test scores are influenced by far more than their teacher, and the various statistical models don’t account for all these factors. The other factors include:
• school factors such as class sizes, curriculum materials, instructional time, availability of specialists and tutors, and resources for learning (books, computers, science labs, and more)
• home and community supports or challenges
• individual student needs and abilities, health, and attendance