by Yong Zhao
8
The Naked Emperor: Chinese Lessons for What Not to Do
While China flails, stuck in its Bread and Butterfly dilemma, the rest of the world, oblivious to the struggle, watches in awe. To them, China is an idol. Envy, terror, and genuine admiration—or a mixture of all three—color most outsiders' response to the ancient, gigantic dragon. Its miraculous economic growth, its rapid ascendance on the global political stage, its explosive growth in patent applications and scientific paper output, and its stunning performance on international tests: all of these triumphs seem to suggest that China has found its way to economic growth without following the Western liberal democratic path. At a time when Western democracies are experiencing economic slowdowns and political chaos, the apparent efficiency of China's authoritarian government is an attractive model: an alternative to Western democracy, a threat to the West's complacent superiority, and a fresh source of political, economic, and intellectual inspiration.
But while China's achievements over the past thirty years are laudable, it's a bit premature to declare authoritarianism victorious over democracy. The Chinese economic miracle is not the result of intensified authoritarian control. Rather, it is the result of an involuntary, pragmatic retreat from a rigid totalitarian regime. Suddenly a large population, previously deprived of any autonomy, had the freedom to conduct their daily economic life as they wished. Although still very limited, that freedom was enough to enable them to take advantage of an increasingly globalized economy. Alas, as the massive uneducated, cheap, and highly motivated labor force dries up and the world economy changes, China's miracle faces the inevitable challenge of upgrading.
The upgrade will require a different workforce, one that is diverse, creative, and entrepreneurial. But despite 150 years of effort, China has failed to develop an education capable of cultivating such a workforce. Since its humiliating encounters with Western powers in the mid-1800s, China has been on a hesitant journey to develop its people's capacity for scientific and technological innovation. But due to its reluctance to move away from authoritarianism, China's educational philosophy and practice remain as incapable of producing creative and innovative talent as they were 150 years ago. Yet the authoritarian nature of Chinese education has proven extremely effective at one thing: producing great test takers. And so, in a world captivated by test scores, Chinese education rises as a shining, overestimated example.
Like its economic accomplishment, China's educational achievement is remarkable and respectable. But promoting Chinese education as the world's best is both scientifically inaccurate and philosophically misleading. Much of the world's admiration rests on a simplistic definition of education quality, a romanticized interpretation of the factors contributing to the system's success and an unquestioning glorification of its authoritarian approach. This misplaced admiration leads to China's elevation as a model for the rest of the world, particularly Western countries such as Australia, the United Kingdom, and the United States. While the admiration may be innocent, putting China on a pedestal is dangerous. At best, it will lead to a waste of time and resources, as other countries struggle to copy a model that was proven obsolete over a century ago. At worst, it will destroy the creative Western educational systems that China has been so eager to copy for more than a century. “Chinese education would be a poison for America, not a remedy,” warns Saga Ringmar, a Swedish high school student who attended a Shanghai school for two years.1
Illusions of Excellence and Equity
The evidence to support China's excellence in education is embarrassingly thin, but it's been well marketed. Given the widespread acceptance of China's peerless status in the world of education, it is mind-boggling to realize that the primary evidence is simply two sets of test scores in three subjects from one source.
China was made the world's model of educational excellence by the Programme for International Student Assessment (PISA), the triennial test of fifteen-year-old students in math, reading, and science operated by the Organisation for Economic Co-operation and Development (OECD). PISA has become the star maker in the education universe because of its bold claim to assess “the extent to which students near the end of compulsory education have acquired key knowledge and skills that are essential for full participation in modern societies.”2 Moreover, PISA claims to find educational stars by identifying which education systems better prepare their children for “full participation in modern societies” as measured by PISA scores. The goal is for educational systems to learn from “the highest-performing and most rapidly improving school systems.”3
China was found to have the “strongest-performing” school system in December 2010, when the 2009 PISA results were publicized. Students from Shanghai topped the rankings in all three subjects. A new star had been discovered! The OECD promoted its star with press releases, interviews, blog posts, publications such as Strong Performers and Successful Reformers in Education: Lessons from PISA for the United States, and an elaborate video series produced by the Pearson Foundation (in collaboration with PISA and OECD).4 Shanghai was now officially an education giant, declared the National Center on Education and the Economy (NCEE), a nonprofit education policy think tank in the United States, in a paper, Standing on the Shoulders of Giants: An American Agenda for Education Reform, later expanded to a book, Surpassing Shanghai: An Agenda for American Education Built on the World's Leading Systems.5 In addition, a host of reports by international media and awestruck comments by government leaders such as US President Barack Obama and Secretary of Education Arne Duncan gave further credence to China's newly acquired reputation.
It is worth pointing out that none of the publications, media reports, and commentaries provided more empirical evidence. Instead, they simply accepted the illusion of excellence and strengthened it by attempting to explain its causes. The only additional evidence was the same top ranking in the 2012 PISA. Shanghai's fifteen-year-olds again ranked first in all three categories. More promotion and publicity followed, including a report compiled by the NCEE entitled Chinese Lessons: Shanghai's Rise to the Top of the PISA League Tables.6
The unsuspecting public, along with many national policymakers, has been sold the notion that PISA measures the quality of educational systems; therefore China's is the best. Yet the sole piece of evidence that supports China's status has been critiqued relentlessly. In a June 2013 article in the Times Educational Supplement magazine, William Stewart raised a barrage of questions:
But what if there are “serious problems” with the Pisa data? What if the statistical techniques used to compile it are “utterly wrong” and based on a “profound conceptual error”? Suppose the whole idea of being able to accurately rank such diverse education systems is “meaningless”, “madness”?
What if you learned that Pisa's comparisons are not based on a common test, but on different students answering different questions? And what if switching these questions around leads to huge variations in the all-important Pisa rankings, with the U.K. finishing anywhere between 14th and 30th and Denmark between fifth and 37th? What if these rankings—that so many reputations and billions of pounds depend on, that have so much impact on students and teachers around the world—are in fact “useless”?7
The article's findings are troublesome to PISA and should be extremely unsettling to its faithful, say scholars who have independently reached the same conclusions. “As far as they are concerned, the emperor has no clothes,” writes Stewart. Citing numerous publications and conversations with scholars in Denmark, Northern Ireland, and the United Kingdom, as well as with the OECD, he points out major technical flaws in how PISA composes its tests, administers them, and uses statistical techniques to generate country rankings. Stewart uses research by Svend Kreiner, a professor of biomedical statistics at the University of Copenhagen, to point out that the PISA rankings are fundamentally flawed because not all students in each country responded to the same questions. “For example, in Pisa 2006, about half the participating students were not asked any questions on reading and half were not tested at all on maths, although full rankings were produced for both subjects,” he writes. Moreover, students in different countries were asked different sets of questions. “Eight of the 28 reading questions used in Pisa 2006 were deleted from the final analysis in some countries.”
Kreiner presents a more serious challenge to PISA. He questions the appropriateness of the model PISA uses to produce the country rankings. PISA uses the Rasch model, a widely used psychometric model named after the late Danish mathematician and statistician Georg Rasch. For this model to work properly, certain requirements must be met. But according to Kreiner, who studied under Rasch and has worked with his model for forty years, PISA's application does not meet those requirements. In an article published in the academic journal Psychometrika, Kreiner and coauthor Karl Bang Christensen show that the Rasch model does not fit the reading literacy data of PISA, and thus the country rankings are not robust. As a result, rankings of countries can vary a great deal over different subsets of questions. For example, Denmark can rank anywhere between fifth and thirty-seventh out of fifty-six countries.8 “That means that [PISA] comparisons between countries are meaningless,” Kreiner told the Times Educational Supplement.
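To make the argument about model fit concrete, here is a minimal sketch in Python. It is an illustration only and uses no PISA data: the three countries, their ability values, and the item difficulties are all hypothetical, and the helper names are invented for this example. It relies on the standard form of the Rasch model, in which the probability of a correct answer depends only on the difference between a test taker's ability and the question's difficulty. When every question has the same difficulty in every country, ranking by expected score gives the same order for any subset of questions. When some questions are easier or harder in particular countries, so that the model no longer fits, the order depends on which questions happen to be counted.

```python
import math
from itertools import combinations


def rasch_probability(ability, difficulty):
    """Standard Rasch model: chance of a correct answer given ability and item difficulty."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def expected_score(ability, difficulties):
    """Expected number of correct answers on a set of items."""
    return sum(rasch_probability(ability, b) for b in difficulties)


def rank_ranges(abilities, item_difficulties, subset_size):
    """Rank countries on every item subset of the given size.

    Returns {country: (best_rank, worst_rank)} across all subsets.
    """
    countries = list(abilities)
    n_items = len(next(iter(item_difficulties.values())))
    ranges = {c: [len(countries), 1] for c in countries}
    for subset in combinations(range(n_items), subset_size):
        scores = {
            c: expected_score(abilities[c], [item_difficulties[c][i] for i in subset])
            for c in countries
        }
        ordered = sorted(countries, key=lambda c: scores[c], reverse=True)
        for rank, c in enumerate(ordered, start=1):
            ranges[c][0] = min(ranges[c][0], rank)
            ranges[c][1] = max(ranges[c][1], rank)
    return {c: tuple(r) for c, r in ranges.items()}


# Hypothetical average abilities for three made-up countries.
ABILITIES = {"A": 0.4, "B": 0.2, "C": 0.0}

# Hypothetical difficulties for six questions.
BASE_DIFFICULTIES = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5]

# Case 1: the model fits. Each question is equally hard in every country.
MODEL_FITS = {c: list(BASE_DIFFICULTIES) for c in ABILITIES}

# Case 2: the model does not fit. Two questions are harder in country A
# and two are easier in country C (assumed values, chosen for illustration).
MODEL_MISFITS = {
    "A": [-1.0, -0.5, 0.0, 1.3, 1.8, 1.5],
    "B": list(BASE_DIFFICULTIES),
    "C": [-1.8, -1.3, 0.0, 0.5, 1.0, 1.5],
}

if __name__ == "__main__":
    print("model fits:   ", rank_ranges(ABILITIES, MODEL_FITS, subset_size=3))
    print("model misfits:", rank_ranges(ABILITIES, MODEL_MISFITS, subset_size=3))
```

In the first case each country keeps a single rank no matter which three questions are used. In the second, a country's best and worst ranks differ, which is a toy version of the instability Kreiner describes. The sketch says nothing about how large the effect is in the real data.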
Kreiner is not the first or only scholar to raise questions about PISA's technical flaws. In 2007, a collection of nearly twenty researchers from multiple European countries presented their critical analysis in the book PISA According to PISA: Does PISA Keep What It Promises?9 Independent scholars from all over the world took apart PISA's methodology, examining how it was designed; how it sampled, collected, and presented data; and what its outcomes were. Then the researchers compared the test's real-life validity to its claims. Almost all of them “raise[d] serious doubts concerning the theoretical and methodological standards applied within PISA, and particularly to its most prominent by-products, its national league tables or analyses of school systems.”10 Among their conclusions were these:
PISA is by design culturally biased and methodologically constrained to a degree that prohibits accurate representations of what actually is achieved in and by schools. Nor is there any proof that what it covers is a valid conceptualization of what every student should know.
The product of most public value, the national league tables, are based on so many weak links that they should be abandoned right away. If only a few of the methodological issues raised in this volume are on target, the league tables depend on assumptions about the validity and reliability which are unattainable.
The widely discussed by-products of PISA, such as the analyses of “good schools,” “good instruction” or differences between school systems…go far beyond what a cautious approach to these data allows for. They are more often than not speculative.11
PISA did respond to some of the technical challenges. For example, Andreas Schleicher, PISA's face to the world, wrote a commentary responding to Kreiner's charges in TES.12 While the dispute over PISA's technical flaws continues, some argue that even if PISA did everything right technically, it still could not possibly claim to be measuring the quality of entire education systems, let alone their students' ability to live in the modern world.
“There are very few things you can summarise with a number and yet Pisa claims to be able to capture a country's entire education system in just three of them,” wrote Hugh Morrison of Queen's University Belfast in Northern Ireland. “It can't be possible. It is madness.”13 Morrison, a mathematician, does not think the Rasch model should be used at all. He argues that “at the heart of Rasch, and other similar statistical models, lies a fundamental, insoluble mathematical error that renders Pisa rankings ‘valueless’ and means that the programme ‘will never work.’ ”14 PISA, according to Morrison, violates a central principle of measurement drawn from physicist Niels Bohr's work: the entity measured cannot be divorced from the measuring instrument. Morrison illustrates his point with an example. Suppose Einstein and a student both produced a perfect score on a test. “Surely to claim that the pupil has the same mathematical ability as Einstein is to communicate ambiguously?” The unambiguous communication would be “Einstein and the pupil have the same mathematical ability relative to this particular [test]…Mathematical ability, indeed any ability, is not an intrinsic property of the individual; rather, it's a joint property of the individual and the measuring instrument.”15 In a nutshell, Morrison's point is that PISA scores reflect students' ability to complete the tasks included in the test, not their general ability to understand and succeed.
Even if PISA did measure cognitive abilities as accurately as it claims to, those abilities span only three domains: math, reading, and science. PISA makes the assumption that these skills are universally valuable. In other words, as Svein Sjøberg, a professor of science education at Norway's University of Oslo, points out, PISA “assumes that the challenges of tomorrow's world are more or less identical for young people across countries and cultures” and thus promotes “kind of universal, presumably culture-free, curriculum as decided by the OECD and its experts.” This assumption is mistaken. He continues, “Although life in many countries do [sic] have some similar traits, one can hardly assume that the 15-year olds in e.g. Japan, Greece, Mexico and Norway are preparing for the same challenges and need identical life skills and competencies.”16
Even if cognitive skills in math, science, and reading were the most important skills in the universe, they would not, and could not, be the only skills an educational system should cultivate. Skills and knowledge in other domains, such as “the humanities, social sciences, foreign languages, history, geography, physical education etc.,” play a crucial role if citizens of any country are to live a fulfilling life.17 So do noncognitive skills: social-emotional skills, curiosity, creativity, resilience, engagement, passion, and a host of other personality traits. In fact, many would argue that talents, skills, knowledge, and creativity in domains outside math, science, and reading are at least as important, perhaps even more important, to live successfully in the new world. Henry Levin, a professor in economics of education at Teachers College, Columbia University, reviews empirical evidence that shows the essential value of noncognitive skills to work and life in his article “More Than Just Test Scores.”18
PISA provides no direct evidence of Chinese students' performance in areas beyond math, science, and reading. Thus, even if PISA were methodologically sound, conceptually correct, and properly administered, its only unambiguous conclusion would be that fifteen-year-old students in Shanghai received the highest scores in math, reading, and science in 2009 and 2012. Leaping from the highest PISA score in three subjects to the best education system in the world is too big a jump for any logical person—unless the purpose of education is defined as doing well on the PISA.
Since no one, not the Chinese and not even the PISA team (I hope), would define the purpose of education as achieving good PISA scores, making China the world's model of educational excellence just because some of its fifteen-year-olds received the highest PISA scores is not only inaccurate but misleading. The excellence is a simple illusion created by the PISA league tables.
PISA's operators refuse to have their shiny new star tarnished. In response to doubts about Shanghai's performance, Andreas Schleicher put out a forceful defense in a blog post in 2013.19 He dismissed critics as narrow-minded, jealous individuals with petty ideas: “Whenever an American or European wins an Olympic gold medal, we cheer them as heroes. When a Chinese does, the first reflex seems to be that they must have been doping; or if that's taking it too far, that it must have been the result of inhumane training.”
Schleicher countered charges that the Shanghai sample did not represent children of migrant workers; reiterated that students were not only good at memorization but could also apply their knowledge in math; and stressed that students in Shanghai have more productive beliefs than students in other countries. Schleicher's arguments about the sampling remain controversial. Tom Loveless of the Brookings Institution challenged him with more data and evidence, to which PISA has yet to provide an adequate response.20 Schleicher's statement about Shanghai students' math performance does no more than simply affirm that Shanghai students are the best PISA performers in math. It does not add any more proof that Shanghai has the best education. And his point about Shanghai students' belief that “they will succeed if they try hard and they trust their teachers to help them succeed” does not add proof either. Instead, it confirms that their PISA performance is a result of “inhumane training” and exemplifies Schleicher's and his like-minded observers' attempts to romanticize the insufferable reality Chinese parents and students experience daily.