Idiocy for All and the Rise of International Large Scale Educational Assessments

By Gustavo E. Fischman, Amy Topper, Iveta Silova

Almost any education-related topic seems to turn into an overheated debate, provoking very strong gut reactions and diminishing any hope for productive discussions that engage in careful analysis of contrasting perspectives and forms of evidence. This is certainly the case with International Large Scale Educational Assessments (ILSEAs), like PISA or TIMSS, whose role in improving student achievement is rarely the subject of nuanced discussion or methodical analysis.

When reading major publications about the latest results of one of the many ILSEAs – research articles, newspapers, or blogs – it is clear that these assessments are often linked to the search for answers to various educational problems. However, there is little consensus among stakeholders about the policy value and relevance of ILSEAs.  In particular, are the results of ILSEAs being used by policy-makers to revise, plan, and execute educational reforms? What changes in national education policies and practices, if any, have been made in countries as a result of ILSEAs?

To answer these questions, we analyzed over a hundred research articles that narrowly focused on these questions and surveyed 90 scholars, asking them the same questions. Our research shows that it is almost impossible to establish any causal or direct relationship between ILSEAs and changes in educational policies. Nonetheless, we found very strong arguments made by researchers, academics, and policymakers asserting the existence of a direct relationship, although a caveat is needed: some of the studies found a positive or beneficial relationship between ILSEAs and changes in educational policies, while others saw a negative relationship. (We will get back to the results of our study shortly.)

As we already said, we should not be surprised by the polarization of our results. Politicians, researchers, teachers, administrators, students, and their families have very strong opinions and perspectives about what works in education, what needs to be fixed, and what the “fix” should be. Each of these stakeholders attacks the others using several arguments, but two of the most common are “You are an idiot; everybody agrees with my idea, which is just good common sense” coupled with a dismissive comment, “your idea lacks any evidence, and even if you have some, it is not as strong as mine.” In fact, it seems that when discussing education, the tendency to be idiotic is quite common, and in many cases proudly so. By “idiotic,” we are not referring to the common usage describing somebody who is not very clever, but to the original meaning of the word in Ancient Greek, as Walter C. Parker (2005) explains:

…idiocy shares with idiom and idiosyncratic the root idios, which means private, separate, self-centered – selfish. “Idiotic” was in the Greek context a term of reproach. When a person’s behavior became idiotic – concerned myopically with private things and unmindful of common things – then the person was believed to be like a rudderless ship, without consequence save for the danger it posed to others. This meaning of idiocy achieves its force when contrasted with politës (citizen) or public. Here we have a powerful opposition: the private individual versus the public citizen (p. 344).

To understand the extent to which educational stakeholders are exhibiting idiotic attitudes towards ILSEAs, and education reforms more broadly, one would first need to examine the discourse around, and reactions to, ILSEAs and their results. Research on ILSEAs has primarily focused on student performance and disparities in outcomes by gender and socioeconomic status, with more limited research on stakeholder attitudes. Our recent research (Fischman & Topper, 2017), funded by the Open Society Foundation, sought to fill this gap by looking specifically at whether national-level educational stakeholders (e.g., ministries of education, national policymakers, other national political and social actors) value these types of international measures of student attainment and to what extent they have integrated ILSEAs into their work at the national level.

ILSEAs as Tools of Legitimation

Our exploratory review of the ILSEA literature found that policymakers appear to be using these assessments as tools to legitimate existing or new educational reforms, although there is little evidence of any positive or negative causal relationship between ILSEA participation and reform implementation. That is, educational reform efforts have often already been proposed or underway, and policymakers use ILSEA results as they become available to argue for or against new or existing legislation.

At the same time, results from our two surveys of ILSEA experts, policymakers, and educators pointed to a growing perception among respondents that ILSEAs are having an effect on national educational policies, with 38% of all survey respondents stating that ILSEAs were generally misused in national policy contexts. Interestingly, experts are generally more critical in their assessment of ILSEAs compared to non-experts, with 43% arguing that ILSEAs are often being misused. They explain that policymakers have little understanding of ILSEAs and use them for “ceremonial effects,” while at the same time arguing that these assessments are too broad and decontextualized to be used meaningfully in national contexts. Based on their professional and personal experiences, respondents were divided over whether ILSEAs actually contribute to or hinder national education reform efforts.

Figure 1. Survey respondents’ perceived impact of ILSAs/GLMs on national and global education policies

Note: Includes responses from both the expert and non-expert surveys.

Perhaps the most significant finding associated with the use of ILSEAs in the literature we reviewed is the way in which new conditions for educational comparison have been made possible at the national, regional, and global levels. Arising from large-scale international comparisons, these new conditions have given rise to many myths about education – whether the presumed poor performance of all public schools is due to teacher (in)effectiveness, or the relevance of a causal link between ILSEA results and economic growth, or, in more general terms, impending education “crises” worldwide – which are increasingly taken as scientific truth (see Rappleye and Komatsu’s (2017) recent commentary about “flawed statistics” and new “truths” in education policymaking). These conditions also create an assumption that there exists a single, globally applicable “best practice” or “best policy,” which can uniformly inform policy-making and improve education in local contexts. From our perspective, the challenge is to avoid the illusion of certainty that any quantitative measure provides. Granted, the challenge is not easy to overcome because, as Nobel Laureate Simon Kuznets (1934) affirmed:

With quantitative measurements especially, the definiteness of the result suggests, often misleadingly, a precision and simplicity in the outlines of the object measured. Measurements of national income (and we can add, of education) are subject to this type of illusion and resulting abuse, especially since they deal with matters that are the center of conflict of opposing social groups where the effectiveness of an argument is often contingent upon oversimplification. (pp. 5-7)

Our research shows that, on the one hand, ILSEAs have the potential to provide governments and education stakeholders with useful and relevant modes of comparison that purportedly allow for the assessment of educational achievement both within cities, states, and regions, and between countries. On the other hand, using idiotic lenses to analyze ILSEAs’ results – the good, the bad, and the ugly – without considering the strong influence of unequal educational opportunities in various contexts or acknowledging the broader political or economic agendas driving the production and use of ILSEAs in education is dangerous.

Generating oversimplified narratives using ILSEAs, disregarding the different contexts and multiple obstacles, showing a lack of concern for the educational opportunities and rights of millions of children, and focusing all the energies on justifying your own opinions – while quickly discarding any counterevidence to legitimate your interests and benefits – is a genuine form of educational idiocy. The best defense against educational idiocy? We have already discovered it, discussed it, experimented with it, assessed it, and considered the evidence: avoid the exclusive reliance on simplistic quantitative measures in determining education outcomes, shift attention away from short-term strategies designed to quickly climb the ILSEA rankings, and implement proven strategies to reduce inequalities in opportunities to improve long-term outcomes. Above all, stop pushing for education reforms based on a single, narrow yardstick of quality.

More than ever, we need to consider multiple types and sources of data, we need to explore more meaningful ways of reporting comparative data, we need to recognize the importance of the civic and public purposes of education, and we need to involve our diverse communities – parents, educators, administrators, community leaders, teacher union representatives, and students – in a public dialogue about what education is and ought to be about. Overcoming education idiocy would thus entail a return to educational questions that are larger and more important than how a country performs on international large-scale assessments.

Acknowledgements

A shorter version of this essay appeared on Education International’s Worlds of Education blog.

References

Fischman, G. E., & Topper, A., with Silova, I., & Goebel, J. (2017). An Examination of Perspectives and Evidence on Global Learning Metrics [Final Report for the Open Society Foundation]. CASGE Working paper 2. Tempe, AZ: Center for Advanced Studies in Global Education.

Fischman, G. E. (2016, May 12). The simplimetrification of educational research. Worlds of Education, Blog of Education International.

Kuznets, S. (1934). National Income, 1929-32. U.S. Congress.

Parker, W. C. (2005). Teaching against idiocy. Phi Delta Kappan, 86(5), 344-351.

Rappleye, J., & Komatsu, H. (2017, July 6). Teachers, “Smart People” and Flawed Statistics: What I want to tell my Dad about PISA Scores and Economic Growth. Worlds of Education, Blog of Education International.

 


Measuring the un-measurable

As part of the organizing team of the Inaugural CIES Symposium in Scottsdale, AZ, this past November, we were thrilled to continue the debates about Global Learning Metrics (GLMs) at the recent CIES 2017 Conference in Atlanta, GA. CIES 2017 included a number of Presidential Highlighted Sessions. Among these was “Measuring the un-measurable in Global Learning Metrics,” featuring four scholars: Karen Mundy (University of Toronto & Global Partnership for Education), Jill Koyama (University of Arizona), Aaron Benavot (University at Albany & UNESCO), and Gustavo Fischman (Arizona State University). Moderated by Iveta Silova (Arizona State University), the participants in this session discussed the merits, possibilities, tensions, and obstacles of GLMs and their influence on education, pedagogical practices, and policy.

Questions such as “What is usually dismissed as too difficult to measure?” and “Should global learning metrics reflect learning outcomes beyond basic numeracy and literacy?” guided the beginning of the debate. Later, Silova challenged the panelists to explain how GLMs could be transparent and explicit in making their underlying epistemologies visible. Silova also asked the panelists to explain their views on whether or not GLMs could originate from the Global South.

Video: Watch the full Presidential Highlighted Session.

In this Presidential Highlighted Session, there was neither a consensus about the meaningfulness of current globally linked measures of learning outcomes nor a clear vision of how they will evolve in the future. Additionally, it was obvious from both the symposium and the CIES 2017 Presidential Highlighted Session that educational stakeholders disagreed on the purpose of a GLM – for some, it was to be used as a simple tool, helpful for gaining data on student literacy performance, while for others, it had been utilized as an overarching instrument that had been abused to rank countries in ways that did not account for contextual differences.

While the Presidential Highlighted Session at CIES 2017 in Atlanta unsurprisingly concluded without a consensus about GLMs, the four scholars brought attention to ideas to consider when examining the merits and disadvantages of creating and relying upon GLMs. Silova explained in her introduction to the debate that GLMs will continue to expand, and Fischman later expressed concern that GLMs, despite their decontextualized nature and limited capacity to measure what is happening in classrooms, have already driven educational policy and reform efforts. In light of the growth of these metrics and to understand their effects, addressing issues of feasibility is essential.

One of the salient themes in the debate focused on who is involved in the creation of GLMs and how GLMs are and will be applied. While Mundy argued for reinforcing national ownership of metrics, Koyama claimed that linking nations by a single global learning metric was not something that the policy and development communities should be working toward. Perhaps there was some overlap between the two sides, however, when Koyama suggested that because metrics do exist, there should not just be a single one. Rather, diverse and contextualized measures are essential so that the local needs are taken into account, and, for Koyama, these metrics should be applied from the bottom up. Koyama cautioned, however, that the data generated by any metric is not the solution. She also raised the ethical issue of authority when she asked about who gives education stakeholders the power to measure learning outcomes. Koyama problematized the use of nuanced metrics to “rank and diminish some [countries] while elevating others.”

Benavot agreed with Koyama’s argument that metrics should not follow a top-down approach. He concurred with Fischman that metrics should be meaningful and advance pedagogy. However, Benavot observed that the current metrics are not used to inform quality instruction at the classroom level. Instead, they are used for policy-making and the distribution of resources. While Mundy argued that GLMs should be used for such distributive justice, Benavot disagreed that a single, cross-nationally linked metric was the correct avenue. He argued that focusing on GLMs distracts from what is happening in classrooms. According to Benavot, if metrics are to be linked, they should be emergent and develop from the local level to the regional and national levels, rather than being imposed from the top down. Despite their dispute about the goal of GLMs, Mundy agreed that metrics should be grounded in national capacity building.

Where does this leave us in the GLM debate? What has been accomplished by holding a symposium dedicated to this topic in November and a follow-up highlighted session four months later? For one, it opened up a dialogue, providing educational stakeholders opportunities to define what they mean when using the term “global learning metric.” It also helped bridge a gap between academia and development to move the conversation in new directions. At the end of the Presidential Highlighted Session, an audience member and Benavot both raised an interesting question that might offer a different angle from which to approach the debate: Is there a universal learning progression that can be defined and measured?

Commentary: Are global learning metrics desirable?

Since the publication of the United Nations’ Sustainable Development Goals (SDGs) in 2015, it has become necessary to develop a means of assessing progress toward their achievement. Included in the 17 goals of the 2030 agenda for sustainable development is SDG 4: Quality Education. This calls on nations to “ensure inclusive and equitable quality education and promote lifelong learning opportunities for all” (Goal 4, 2015). There is a mutually agreed upon goal among the United Nations’ 193 member states to increase access and educational quality globally, yet there is a lack of consensus about how to measure the achievement of that goal. Indeed, there is no catchall solution to the question of evaluating equity and access in education (Edwards, 2016). In the absence of a single, cross-nationally comparable metric, international tests such as the Programme for International Student Assessment (PISA) and the Trends in International Mathematics and Science Study (TIMSS) have been used to rank the cognitive skills and academic achievement of the youth in participating countries. Although data from these tests are compared and ranked internationally, it is notable that only one-quarter of the United Nations’ member states participate in and report on TIMSS, and even fewer participate in PISA. These current metrics do not allow for the evaluation of the successful attainment of SDG 4 across all 193 member states of the United Nations. At best, they reduce global achievement to the accomplishment of a fraction of the globe.

To overcome the lack of a single test measuring global progress toward access to high quality education for all, some academics and policy-makers support the use of globally linked national metrics. For example, Hanushek argues that the cognitive skills measured in tests such as TIMSS and PISA are necessary for an educated labor force equipped to foster economic development (CIES Symposium, 2016; Hanushek, 2016). Others suggest that globally linked, cross-national indicators of educational attainment such as PISA and TIMSS provide data that helps bring attention and international aid to education as a means for international development (CIES Symposium, 2016; Mundy, 2016). In response to the need to measure global educational quality and access, Silvia Montoya, the director of the UNESCO Institute for Statistics, is concerned with harmonizing local- and regional-level metrics and coming to a global consensus about the basic literacy and numeracy skills that, once attained, would signal educational quality (CIES Symposium, 2016).

Video: Silvia Montoya explains that the harmonization of current metrics is needed.

However, the practical problem of measuring inclusive and equitable education globally should not require cross-national rankings. Instead, locally generated metrics should be applied in the context where they were gathered in order to improve national educational quality. In other words, locally generated measures of educational attainment could inform contextualized social and pedagogical changes for targeted educational improvement. Although SDG 4 calls for global attainment by 2030, until there is a way to compare national progress toward the achievement of SDG 4 without ranking nations hierarchically, the current metrics will only offer contextualized snapshots of student achievement and should not be cross-nationally ranked.

Hanushek argues that globally linked, cross-nationally comparable metrics provide a vision and model to “have-not nations” of what is possible (CIES Symposium, 2016).

Video: Eric Hanushek calls for the use of Global Learning Metrics to show what is possible to “have not” nations.

Yet, in doing so, such rankings serve to perpetuate the notion of the superiority of some nations over others. The problem of hierarchical comparisons is compounded by current metrics, such as PISA, that provide a myopic view of the performance of a small sample of primarily high-income countries. Despite the absence of the majority of nations from the current metrics, it is assumed that the high-performing, high-income, highly developed countries are blueprints for the future of low-income, aid-dependent nations.

An additional attraction to the cross-national comparison of globally linked national metrics is that, according to Mundy, such comparisons are a way to highlight that learning outcomes are not equitably distributed. The end goal of drawing attention to learning outcome inequities would be to target the distribution of resources to countries most in need (CIES Symposium, 2016; Mundy, 2016). Nonetheless, such distributive justice could also occur within national contexts. In this case, if student success measures indicate that a greater investment in education is necessary, governments could respond by reallocating funds or increasing taxes to invest in human capital development through their national education systems. The use of locally generated metrics for national distributive justice could curtail international aid that is provided based on the assumption that donors know what aid-dependent countries need (Moyo, 2009). If global learning metrics were to justify an increase in aid with oppressive repayment conditions, the abuse of metrics for the maintenance of national educational systems that are aid-dependent could do more harm than good.

In an attempt to simplify the complex task of measuring the success of SDG 4, the UNESCO Institute for Statistics aims to harmonize local and regional metrics. However, instead of harmonizing metrics globally, nations should strive to humanize metrics locally. It is important to re-center the human – the individual students – in a discussion about potential avenues for increased educational quality. Measures of student achievement are rich, individual-level data that are aggregated, abstracted, and decontextualized to become national rankings. These national rankings, such as those generated by PISA data, add credence to the argument that top-performing national education systems are models of what is possible for lower ranking and non-participating nations. As a result, top-performing nations model “best practices” that can be transferred to “have-not” nations. Pedagogical changes based on decontextualized national rankings distort student- and classroom-level data that could provide contextualized solutions aimed at improving educational quality.

The local need for pedagogical improvement is, therefore, in tension with the notion that national rankings inform best practices in education for economic development. Just as the current metrics do not adequately measure SDG 4 attainment, the current system does not adequately address a diverse conceptualization of what constitutes quality education. For this reason, the reproduction of a one-size-fits-all education model that can be assessed by a simple metric will not suffice.

References

Edwards, D. (2016). Are global learning metrics desirable? That depends on what decision they are attempting to inform. Retrieved from https://education.asu.edu/sites/default/files/ps_david_edwards.pdf

CIES Symposium. Edwards, D., Hanushek, E. A., Montoya, S. & Mundy, K. (2016, November). Are global learning metrics desirable? In I. Silova (Chair), The Possibility and Desirability of Global Learning Metrics: Comparative Perspectives on Education Research, Policy and Practice. Inaugural symposium of the Comparative and International Education Society, Scottsdale, AZ.

Goal 4: Sustainable Development Knowledge Platform. (2015). Retrieved December 05, 2016, from https://sustainabledevelopment.un.org/sdg4

Hanushek, E. A. (2016). Are global learning metrics desirable? Retrieved from https://education.asu.edu/sites/default/files/ps_eric_hanushek.pdf

Moyo, D. (2009). Dead Aid: Why aid is not working and how there is a better way for Africa. New York, NY: Farrar, Straus and Giroux.

Mundy, K. (2016, November 9). Setting the stage for the CIES Symposium on Global Learning Metrics (Karen Mundy). FreshEd with Will Brehm. Podcast retrieved from http://www.freshedpodcast.com/karenmundy/

The Possibility and Desirability of Global Learning Metrics

In November 2016, the Center for Advanced Studies in Global Education (CASGE) and edXchange at Arizona State University hosted the Inaugural Symposium of the Comparative and International Education Society (CIES) in Scottsdale, Arizona. The Symposium addressed the theme “The Possibility and Desirability of Global Learning Metrics: Comparative Perspectives on Education Research, Policy and Practice.” The Symposium brought together academics, practitioners, policymakers, educators, and social activists for an alternating series of keynote plenary debates and parallel sessions about the desirability and feasibility of global learning metrics. The participants came from 61 institutions, 17 different countries, and 17 states within the United States.

We selected global learning metrics as the focus of the Symposium because it is a timely and increasingly challenging educational and political issue at the center of multiple global debates about the future of education. Learning outcomes have recently been enshrined as central policy objectives in the new international education and development agenda. Unlike efforts to universalize access to education, for which consensus is strong, questions of learning are considerably more contested. Proponents argue that more robust global learning metrics have the potential to reduce academic disparities and improve learning outcomes for children across different contexts. Critics note that such universal measures typically focus on a narrow assessment of basic skills, while overlooking the importance of a more holistic approach to education, including human rights, aesthetics, morality, religion, or spirituality. Others call attention to the dangers associated with the emergence of data-fixated punitive accountability regimes, privatization and marketization of public education, and a growing disconnect between systems, actors, and larger pedagogic changes. Some critics warn that global learning metrics can contribute to enacting hegemonic neocolonial globalization. More broadly, the debate about global learning metrics reveals an underlying tension in our field – a tension between the desire to replicate and scale up “best practices” (and an assumption that there is a global consensus on what constitutes “good” education), on the one hand, and the awareness of the importance of context and of deeply culturally contextualized education practice, on the other hand. Bringing a comparative perspective to the disjuncture between replicability and contextuality is one way our field can contribute to education research and practice broadly.

These tensions raise the central questions that guided the organization of this Symposium: Are global learning metrics desirable and are they feasible? How can learning among children be measured and compared across diverse contexts and systems? Which learning domains should be assessed and why? How is learning revised or reframed for those who have less power or less “value” in the society in which they reside? How, if at all, are learning assessments actually used by governments, nongovernmental entities, teachers, curriculum developers, and other stakeholders? The Symposium brought together a group of researchers, policymakers, practitioners, and activists for a focused intellectual and policy engagement around these questions. While not designed to forge consensus or alignment, the Symposium was a step towards linking together academic research and policy debates in order to enable critical reflection, innovation, and proactive action in the area of developing global learning metrics.

The Symposium featured 4 plenary keynote debates, which addressed issues ranging from the desirability and feasibility of global learning metrics to their potential to be pedagogically innovative and culturally responsive. In addition, we offered 3 workshops and 65 parallel presentations on a range of topics related to Global Learning Metrics. The plenary keynote debates were recorded and are available for viewing below.

Plenary keynote debate #1: Are global learning metrics desirable?

Moderated by: Iveta Silova

Panelists: David Edwards, Education International; Eric Hanushek, Stanford University; Silvia Montoya, UNESCO Institute for Statistics; Karen Mundy, Global Partnership for Education

Learning metrics of any sort are necessarily politicized, as they raise issues with clear philosophical, technical and policy dimensions. The first keynote session of the symposium set the stage for the debate by focusing on the different actors and rationales behind the development of global learning metrics. The guiding questions included: Are global learning metrics desirable and why? What are the end goals of global learning metrics from the panelists’ particular disciplinary and institutional perspectives? Why do we need GLMs at this particular moment? What are the main political challenges and opportunities? Who should be in charge of the development of global learning metrics and who should pay for it? What role should nation-states, international agencies, NGOs, teacher unions, academics, and other actors play in coordinating efforts to develop global learning metrics?

Video: Watch Eric Hanushek and David Edwards debate about why we need GLMs now.

Plenary keynote debate #2: Are global learning metrics feasible?

Moderated by: Gustavo E. Fischman

Panelists: Monisha Bajaj, University of San Francisco; Aaron Benavot, UNESCO Global Monitoring Report; David C. Berliner, Arizona State University

Developing learning metrics is a complex and contested enterprise. It is one of the biggest political, pedagogical, and technical challenges of contemporary educational systems. The second plenary debate focused on the feasibility of the development of global learning metrics, addressing the following questions: How can we measure and compare educational achievement and outcomes across diverse contexts and educational systems? Can GLMs capture educational outcomes beyond the basic numeracy and literacy skills? What balance can be sought between the assessment of basic numeracy and literacy skills and the measurement of learning related to informational technologies, citizenship, human rights, sustainability, aesthetics, morality, religion and/or spirituality? In other words, how can we measure what is often pronounced as “too difficult to measure” though it is at the core of teaching and learning?

Plenary keynote debate #3: Can global learning metrics be pedagogically innovative?

Moderated by: Sherman Dorn

Panelists: Chris Higgins, University of Illinois at Urbana-Champaign; Radhika Gorur, Deakin University; Pasi Sahlberg, University of Helsinki

In the discussion of global learning metrics, the global data banks and the International Large-Scale Assessments (ILSAs) – so-called “big data” – have played a central role. While producing new information and important insights about educational systems and learning outcomes, the big data movement has serious limitations. The third keynote debate was structured around the following guiding questions: How well is big data suited to help us make decisions about improving teaching and learning in schools and classrooms? Do global learning metrics actually allow for pedagogical innovation or do they narrow pedagogical practices? What are alternative assessment and measurement tools that could complement global efforts of increasing educational access and outcomes, as well as improving teaching and learning?

Plenary keynote debate #4: Can global learning metrics be culturally responsive?

Moderated by: Gustavo E. Fischman

Panelists: Supriya Baily, George Mason University; Stafford Hood, University of Illinois at Urbana-Champaign; Hugh McLean, Open Society Foundations; J. Douglas Willms, University of New Brunswick

The concept of global learning metrics is based on the assumption that there is an agreement about what constitutes “good” and “quality” education worldwide. However, efforts to develop global learning metrics have often neglected the diversity of cultural contexts and educational systems. The symposium concluded with a debate about whether global learning metrics are culturally responsive. The panelists were asked to address the questions: Is there a global core of fundamental knowledge, skills and competencies that are relevant across different countries? How can GLMs capture the dynamics of race, ethnicity, class, gender, religion, and other factors that contribute to students’ cultural identities? More broadly, how can GLMs be more culturally responsive and relevant in the context of uneven power dynamics globally?

These four debates highlight some of the perspectives about the desirability and feasibility of GLMs, but there are many ways to approach the debate, and these diverse opinions will be featured in forthcoming blog posts.