Part 1: The Problem with High Stakes Tests
One of the recommendations in the Glaze report was for an external office to oversee standardized testing, a “Student Progress Assessment Office”. But what do we mean by standardized tests? How are they different from the ones a teacher makes up?
Standardized tests are created outside the classroom and are used to compare students, classes, programs, and policies. They are often created by departments of education or by private business interests (textbook publishers are major players in the industry). A useful distinction is based on what happens with the results. High stakes tests are those whose scores are used to assign serious consequences for educators, students or schools. Test scores used for “streaming” children into various levels, or to determine students’ career prospects at an early age, are a prime example of “high stakes”. High stakes tests can also be used to determine consequences for schools. In Britain, where children sit standardized exams known as SATs every 4 years, League Tables publish every school’s results, and schools at the bottom can be deemed failing and closed. Teachers can be judged on their students’ performance on these tests, and sometimes paid accordingly. Children can be denied access to schools and programs based on them – in Britain, SATs are extremely high stakes. More detail on the negative effects of high stakes tests can be found in previous blog posts. (Joy in Learning)
The purpose of low stakes tests is diagnosis – the objective is to assist learning by providing information to administrators, to teachers and to the takers of the tests which can help them improve. The main comparison is of an individual’s or program’s progress over time. For example, a teacher may administer a diagnostic reading assessment to an individual child to assess that child’s strengths, progress since the last assessment, and needs. When these results are acted upon, there is a direct benefit to the student. Sometimes tests are administered to a representative sample of the population to assess how a program or a teaching approach is working. These can be very valuable for educational planning and curriculum development, but the results of individual students or teachers are not singled out.
My criterion for a good assessment tool is simply this: does it help students learn? Most high stakes tests do not help the students who take them improve their own learning – and are not intended to. Either students get results too late to do anything about them, or the results are simply a grade or number, with no explanation of how it was achieved.
Low stakes tests, on the other hand, can help students, teachers and administrators learn. Large scale, randomly sampled tests can give us valuable information about policies and programs that work. The Programme for International Student Assessment (PISA), administered by the OECD for the past 18 years, is an example of a test that tries to do precisely that. It is given to 15-year-olds around the world every 3 years and measures “what students know and can do” in reading, math and science. Because it also gives students and teachers questionnaires to gather data on many other aspects of their lives – socio-economic status, attendance records, attitudes towards school and so on – PISA data can be used to analyze which factors matter most for learning. And because it is low stakes and is administered only to a sample of students, there are no adverse repercussions of low scores for the takers of the tests or their teachers – which means that results are not skewed by “teaching to the test” or by drilling students in how to maximize test scores.
During my university days, I took courses in test design, and an important thing I learned is how difficult it is to design a test that is truly objective – cultural, class and intellectual biases are inherent in most standardized tests. For most of my career, I was happy to teach at a school that did not rely on them, and in fact used a variety of “authentic assessment” measures designed to help students learn from their work. I look at all standardized test results with healthy skepticism – they can be a useful diagnostic when well designed, but even low stakes tests on randomized samples are blunt instruments for measuring a school’s or program’s worth.
Because of these experiences, I am always amazed when I hear government or business people talk about standardized tests as if they were some kind of holy grail of measurement. I find it remarkable that a business leader (with no personal experience of education within the last forty years) can talk about a “national emergency” because test results drop by 2% (https://www.theglobeandmail.com/news/national/education/canadas-fall-in-math-education-ranking-sets-off-red-flags/article15730663/). Apparently, to some people, these tests are more “objective” than teachers’ feedback – an attitude that has contributed to a lack of trust in teachers. Educators know how unreliable these tests are; they also see the fall-out of high stakes tests in the narrowing of the curriculum, children’s stress, and the sometimes drastic consequences that follow. Educators around the world are leading the charge against high stakes tests.
PISA, on the other hand, is meant to be low stakes and has over the years produced evidence-based, statistically significant connections between policies and academic results. Each round of testing produces about 6 huge volumes of analysis. For example, they have found that countries where “streaming” (based on high stakes tests) is practiced perform less well on average. Years of early childhood education are positively correlated to PISA results. “PISA 2012 also finds that the highest-performing school systems are those that allocate educational resources more equitably among advantaged and disadvantaged schools and that grant more autonomy over curricula and assessments to individual schools.” (https://www.oecd.org/pisa/keyfindings/pisa-2012-results-volume-I.pdf) PISA has even pointed out the positive impact of principals and teachers collaborating – something our government has chosen to ignore.
Sometimes there are problems with the way PISA compares countries, particularly when certain cities in China are treated as separate countries and then, not surprisingly, get the highest results in the world (Shanghai, for example). Mary Campbell’s article in the Cape Breton Spectator (https://capebretonspectator.com/2018/02/21/pisa-assessment-glaze-ns-schools/) points out some other problems. However, PISA has produced some valuable directions for educational policy makers, if they choose to listen, and it is a rough barometer of how our schools are doing compared to the rest of the world. Canada does exceptionally well on PISA, and Nova Scotia holds its own within Canada as a small, less well-off province.
Test scores can be seriously misinterpreted – as I discussed in a previous blog post, “Why a College of Educators?”. We have just seen how the Glaze report used them to justify its recommendations, and how the government has incorporated them into Bill 72, the Nova Scotia Education Reform Act, which was passed yesterday. But nothing in Bill 72 will improve standardized test results (nor will it help children learn, which is not the same thing). The biggest impact may well be an increase in the government’s ability to impose more tests on the school system, to control teachers and their “managers” in finer detail, to impose new programs on schools and, soon, to justify more privatization. It has happened in Britain, and it’s not pretty. It will happen here if we do not fight the implementation of this bill.
Oxford lecturers, on strike for several weeks, have now won their battle: the university will keep their pensions as they were.