New project: An alternative to standardized tests

If you could create an ideal dataset of students’ and teachers’ experiences at their schools, what would be in it? How might you use it?  And, specifically, would you ask about school diversity or school integration?

I’m asking myself similar questions now, as I start a new position as the director of School Quality Measures for the Massachusetts Consortium for Innovative Educational Assessment (MCIEA). And, I’m genuinely interested in readers’ thoughts on this. 

MCIEA uses surveys of teachers and students, performance-based student assessment, and administrative school-level data to measure schools in a much more holistic way than would ever be possible with standardized, multiple-choice exams. By definition, data collected for the project are not used for accountability decisions. The consortium currently includes 8 school districts in Massachusetts, all of which joined voluntarily.

Further below, I have additional details about MCIEA. But, first – some background on the connection between school integration and school measurement. I think these are often viewed as non-overlapping topics, but they shouldn’t be. Indeed, much of contemporary school segregation is held in place by dominant (and extremely flawed) ways of measuring schools that are connected to a long history of racial discrimination. Here are some of the sources that are most influential in my thinking:

In the last year, I completed a post-doc at Penn State’s Center for Education and Civil Rights. (At that time, SD Notebook merged with CECR, and I should note that they will remain affiliated, even though my post-doc has ended.) We held a conference for the 65th anniversary of Brown v. Board, where Nikole Hannah-Jones gave the keynote. Here’s part of what she said:

Nikole Hannah-Jones at Penn State’s Brown@65 Conference

I’m not sure that a single picture could better sum up my move from CECR to MCIEA. As described in an earlier post, Nikole Hannah-Jones reminded us all that standardized testing was developed by eugenicists, and that standardized tests were (and still are!) used to give scientific authority to the myths about Black inferiority invented to justify slavery.

Of course, many others have made similar arguments – Wayne Au is easily among the most poignant voices on this. Just last month, he published a piece in Rethinking Schools, with the subtitle “white supremacy, high-stakes testing, and the punishment of Black and Brown students.” He also wrote a short journal article with Matthew Knoester called “Standardized testing and school segregation: like tinder for fire?” I highly encourage reading these. Here are a few key arguments and related excerpts from each:

Tests do not accurately measure teaching and learning

  • “Test scores correlate most strongly with family income, neighborhood, educational levels of parents, and access to resources — all factors that are measures of wealth that exist outside of schools.”
  • “While teachers are central to how our children learn and experience education, the tests offer such narrow measures that they miss most of the processes, experiences, and relationships that define teaching and learning.”

Tests are used to provide scientific cover for racist views of intelligence 

  • “In 1916, based on standardized test scores, Stanford Professor Lewis Terman (one of the founders of standardized testing in U.S. schools) argued that certain races inherited “deficient” IQs, saying that “No amount of school instruction will ever make them intelligent voters or capable citizens.” He further asserted that “feeblemindedness” was “very, very common among Spanish-Indian and Mexican families of the Southwest and also among negroes [sic],” and suggested that “Children of this group should be segregated in special classes and be given instruction that is practical . . .”” 
  • “With the authority bestowed by such ‘scientific’ findings, eugenicists – who believed in the genetic basis for behavioral and character traits they associated with gender, race, and class differences – advocated that race mixing was spreading the alleged inferior genes of African Americans, other non-white peoples, and immigrants.” (See this book for more background.)

Tests are particularly harmful to schools that disproportionately serve students of color

  • “Low-income kids and kids of color are tested more; experience the greatest loss of time spent on non-tested or less-tested subjects like art, music, science, and social studies; don’t have multicultural, anti-racist curriculum made available to them because those areas are not on the tests; and lose opportunities for culturally relevant instruction because the tests tend to inhibit process-based, student-centered instruction in favor of rote memorization.”
  • “Since on the whole, low-income and communities of color perform poorly on these tests, research has consistently found that the pressures of high-stakes, standardized testing are greatest in states and districts with large populations of non-white students, and that the narrowing of the curriculum to align with the tests is sharpest in schools with large, populations of non-white students.”

High-stakes standardized tests function as a proxy for whiteness

  • “Test scores can serve as a proxy for parents to make functionally racist judgments about school and educational quality without talking about race explicitly.”
  • “Testing allows parents and others to avoid the stigma of saying out loud that they favor segregation as they choose schools with a whiter and richer population for their own children, and also provides justification for their support of segregation within schools.” 

A few other, quick examples – Amy Stuart Wells, who has a long record of influential research on school integration, used her 2019 AERA presidential address to focus on standardized testing and school segregation. In part, she argued that:

  • “Standardized tests punish students for not knowing what someone who does not know them decides they need to know.”
  • And, she asked the audience of educational researchers to “reimagine education and how it is measured. We need to reexamine what and who is defined as deviant, and excluded.”

She opened the address with a short play (25 mins) in which youth artists/activists from the Epic Theater Ensemble dramatize the emotional impact of testing on K-12 students. And, following the address, she premiered a short documentary (22 mins) about the relationship between testing, curriculum, discipline and school segregation, which includes Jack from MCIEA. You can watch everything here.

And, there are similar points in the popular new book, “How to Be an Antiracist” by Ibram X. Kendi:

  • “As historian Ibram X. Kendi, author of the recently published ‘How to Be an Antiracist,’ explained recently, ‘For a hundred years, Americans have been making the case that Black people, Latino people are not achieving intellectually as much as other people, as much as white people. And I would argue, no, the problem isn’t with these test takers; the problem is with the tests themselves.'” (From this Daily News letter to the editor.)
  • “The use of standardized tests to measure aptitude & intelligence is one of the most effective racist policies ever devised to degrade Black minds & legally exclude Black bodies.”
  • See more in this Twitter thread from James Noonan, my predecessor at MCIEA, who helped develop the survey instruments and data analysis system. He is now a faculty member at Salem State University.

Clearly, there’s a lot wrong with a system that relies on narrow measures of student learning whose very origins lie in the racist pseudo-science of human intelligence. We see this all the time in the debate about school integration, especially in racialized notions about “good” schools and “bad” schools and parents’ fears about sending white students to diverse schools.

MCIEA is one possible response to the shortcomings of standardized assessment, and its data have a lot to offer to the conversation about school diversity. The project was co-founded in 2015 by Jack Schneider, who has written extensively on problems with test-based standardized assessment and who is a co-host (along with Jennifer Berkshire) of the popular Have You Heard podcast about education policy/history. Jack was a guest on a different podcast, where he gave a good overview of his work and the vision behind MCIEA. You can also read more on this fact sheet and in the links in this post, which largely come from James Noonan’s blog posts throughout various stages of the project.

As part of my job, I’ll be leading the survey work, which is based on a very straightforward premise: it’s possible to learn a lot about a school by asking the teachers who work there and the students that it aims to serve. We use the surveys (details here) along with the administrative data to measure school quality (but not rate schools against each other!) according to the following 5 categories

We also partner with consortium schools on using their data for holistic school improvement. All data are available on an online dashboard that displays results on each of the 5 categories and related sub-scales. Here’s an example of what that looks like:  

In addition to providing schools with more holistic/useful/timely data than those provided by standardized exams, I’m obviously motivated by what the survey results can tell the public about school quality and school diversity. Our consortium captures a lot of the variety of school racial composition in Massachusetts, from predominately white schools to those that predominately serve students of color and variations in between. So, we can disaggregate according to “segregated” and “diverse” schools (putting aside the complexities of defining those terms) and see if there are differences in students’ experiences as reported on the surveys. Especially because much of the school diversity research relies on test scores to measure benefits for students, we are hoping this kind of work can add important nuance to our understanding of the benefits of school diversity. 

In particular, I’d like to dig into differences among “segregated schools” – I think this term has come to specifically refer to schools that predominately serve students of color. However, there are of course many segregated white schools as well. There are major differences in the way these two kinds of schools serve their students, and both kinds of segregation compromise the notion of education as preparation for participation in a multicultural democracy. So, that leads me to these questions:

  • Are there overall differences, by racial sub-group, when comparing survey ratings from segregated white schools against ratings from segregated non-white/global majority schools?
  • How do students of color in segregated white schools rate their schooling experiences? And, how do their ratings compare to students of color in segregated global majority schools?
  • How do white students in segregated white schools rate their schooling experiences? And, how do their ratings compare to white students in segregated global majority schools?
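As a rough illustration of the kind of disaggregation these questions imply, here is a minimal Python sketch. Everything in it is an assumption made for the example: the field names (`school_type`, `student_group`, `rating`), the 1-5 rating scale, the two helper functions, and the handful of records are invented and are not MCIEA's survey schema or data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical survey records: every field name and value here is invented
# for illustration and does not come from MCIEA data.
responses = [
    {"school_type": "segregated_white", "student_group": "white", "rating": 4.0},
    {"school_type": "segregated_white", "student_group": "students_of_color", "rating": 3.0},
    {"school_type": "segregated_white", "student_group": "white", "rating": 5.0},
    {"school_type": "segregated_global_majority", "student_group": "students_of_color", "rating": 4.0},
    {"school_type": "segregated_global_majority", "student_group": "students_of_color", "rating": 2.0},
    {"school_type": "segregated_global_majority", "student_group": "white", "rating": 3.0},
]


def mean_rating_by_group(records):
    """Return {(school_type, student_group): mean rating} for the given records."""
    grouped = defaultdict(list)
    for record in records:
        key = (record["school_type"], record["student_group"])
        grouped[key].append(record["rating"])
    return {key: mean(ratings) for key, ratings in grouped.items()}


def gap_between_school_types(means, student_group, type_a, type_b):
    """Difference in mean rating for one student group across two school types."""
    return means[(type_a, student_group)] - means[(type_b, student_group)]


means = mean_rating_by_group(responses)

# Print one row per (school type, student group) combination.
for (school_type, group), avg in sorted(means.items()):
    print(f"{school_type:28s} {group:18s} {avg:.2f}")
```

A real analysis would also need to report how many responses sit behind each average, since small groups (for example, students of color in segregated white schools) can produce unstable means, and it would need a defensible rule for labeling a school "segregated" in the first place.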

What would you want to know? Let me know in comments or feel free to reach out to my new email address:

2 thoughts on “New project: An alternative to standardized tests”

  1. People should acknowledge that there is a lot of racism built into tests and the testing regime and the way the results are used, as discussed here. A related huge issue is the ways tests have entirely distorted our curriculum, teaching and schools. Still, standardized test results need not be meaningless or useless. We are just not using them right at all. The big issue is that tests are serving the wrong purposes.

    The problem with statements like “Tests do not accurately measure teaching and learning” is that it pretty much says that there is no standardized test that could, and the results of all standardized tests are mostly meaningless. That is plainly just not true, and makes a lot of people reject everything else in these discussions (which is a shame).

    “Test scores correlate most strongly with family income, neighborhood, educational levels of parents, and access to resources — all factors that are measures of wealth that exist outside of schools.” OK, but a correlation is not an answer or a policy conclusion, even though that statement is constantly used as if it were. The conclusion to be drawn is *not* that tests only measure these things and therefore tests cannot be meaningful.

    The real question is causation, not correlation. The real conclusion we should all discuss is that if we provide those things (resources, support, good learning environments) to all students they will learn and thrive. The causation is:

    (A) Resources and support -> (B) actual learning -> (C) higher standardized test scores

    Give all students A and they can do B and C. B actually correlates with C, despite the widespread suggestion that because it is not perfectly accurate it is meaningless.

    Another way to say it is that because of this CAUSATION (not correlation), test results often reflect disadvantage, not only learning. Tests measure both A and B. So let’s make sure everyone has A, so they can get B.

    To state or imply that students’ performance on a well designed test cannot be meaningful is deeply unhelpful to everyone, most importantly to the students themselves. To imply that we should not be working to get *ALL* student performance up to their potential, using meaningful assessments, is just wrong.

    Of course, we still must design much better ways to assess performance. We still must sharply reduce standardized tests’ role in school system outcomes. And, more importantly, we must use tests in ways that help students, rather than in ways that promote racism and perpetuate disadvantage.


    • Thanks for reading and for this very thoughtful comment. I’d respond with 2 quick points:

      1) Despite the measurement issues discussed in the post, I do still think there’s a way tests can be useful. In particular: when not tied to high-stakes punishments, when analyzed according to growth, when used by local-level practitioners with deep knowledge of students, etc. So the line about not accurately measuring teaching and learning was meant as an argument that we shouldn’t attach high-stakes consequences to tests, not that they aren’t useful at all. I definitely agree that the stronger claim is plainly not true and that I could have explained it better in the post.

      2) I think I have a more cynical perspective on the chain of causation that you outlined. I see tests used as a reason/justification for not providing A to particular schools (often schools that serve a majority of low-income students and/or students of color). The part about “if we provide those things to all students” is major, and, as a society, we’ve never even come close to it – reliance on high-stakes standardized tests is one of the latest (and most effective) ways that we’ve avoided this. One quick example from MA (where I live): in a recent school funding bill, the governor proposed withholding state aid from schools that have 3+ years of low test scores. This kind of thinking is everywhere.


