Lies, damned lies, and Ofsted’s pseudostatistics


First published at physicsfocus.

It’s been a week since Michael Gove was unceremoniously given the boot from his role as Education Secretary. The cheers of teachers still echo around staff rooms and schoolyards up and down the country.

Gove was variously described as incredibly unpopular, a hate figure, utterly ruthless, and a “toxic liability”. And that was just by his colleagues in the Coalition. (Allegedly.) Those who shared his simple-minded, wilfully uninformed, and proto-Victorian views on education, including a certain Richard Littlejohn, saw Gove’s unpopularity as arising simply because he was driving through what they considered to be essential reforms of an ailing education system. (My deep apologies for the preceding link to a Daily Mail article and its associated sidebar of shame. It won’t happen again. I also offer a posthumous apology to those Victorians who would likely have baulked at the suggestion that their educational methods were as backward-looking as those of Gove.)

Just why are Littlejohn and his reactionary ilk so certain that the English education system is, as they’d have it, going to hell in a handcart? A very large part of the reason is that they naively, quaintly, yet dangerously assume that education is equivalent to a competitive sport where schools, teachers, and children can be accurately assessed on the basis of positions in league tables. What’s worse – and this is particularly painful for a physicist or, indeed, anyone with a passing level of numeracy, to realise – is that this misplaced and unscientific faith in the value of statistically dubious inter-school comparisons is at the very core of the assessment culture of the Office for Standards in Education, Children’s Services and Skills (Ofsted).

An intriguing aspect of the swansong of Gove’s career as Education Secretary was that he more than once ‘butted heads’ with Michael Wilshaw, head of Ofsted. One might perhaps assume that this was a particularly apposite example of “the enemy of mine enemy is my friend”. Unfortunately not. Ofsted’s entirely flawed approach to the assessment of schools is in many ways an even bigger problem than Gove’s misplaced attempts to rewind education to the halcyon, but apocryphal, days of yore.

Moreover, Gove’s gone. Ofsted is not going anywhere any time soon.

I’ve always been uncomfortable about the extent to which number-abuse and pseudostatistics might be underpinning Ofsted’s school assessment procedures. But it was only when I became a parent governor for my children’s primary school, Middleton Primary and Nursery School in Nottingham, that the shocking extent of the statistical innumeracy at the heart of Ofsted’s processes became clear. (I should stress at this point that the opinions about Ofsted expressed below are mine, and mine alone.)

Middleton is a fantastic school, full of committed and inspirational teachers. But, like the vast majority of schools in the country, it is subject to Ofsted’s assessment and inspection regime. Ofsted’s implicit assumption is that the value of a school like Middleton, and, by extension, the value of the teachers and students in that school, can be reduced to a set of objective and robust ‘metrics’ which can in turn be used to produce a quantitative ranking (i.e. a league table). Even physicists, who spend their career wading through reams of numerical data, know full well that not everything that counts can be counted. (By the way, I use the adjective “inspirational” unashamedly. And because it winds the likes of Littlejohn and Toby Young up. As, I’d imagine, does starting a sentence with a conjunction and ending it with a preposition.)

But let’s leave the intangible and unquantifiable aspects of a school’s teaching to one side and instead critically consider the extent to which Ofsted’s data and processes are, to use that cliché beloved of government ministers, fit for purpose. In its advice to governors, Ofsted – rather ironically, as we’ll see – stresses the key importance of objective data and highlights that the governing board should assess the school’s performance on the basis of a number of measures which are ‘helpfully’ summarised at websites such as the Ofsted Data Dashboard and RAISE Online.

Ofsted’s advice to governors tacitly assumes that the data it provides, and the overall assessment methodology which gives rise to those data, are objective and can be used to robustly monitor the performance of a given school against others. Let’s just take a look at the objective evidence for this claim.

During the governor training sessions I attended, I repeatedly asked to what extent the results of Ofsted inspections (and other Ofsted-driven assessment schemes) were reproducible. In other words, if we repeated the inspection with a different set of inspectors, would we get the same result? If not, in what sense could Ofsted claim that the results of an inspection were objective and robust? As you might perhaps expect, I singularly failed to get a particularly compelling response to this question. This was for a very good reason: the results of Ofsted inspections are entirely irreproducible. A headline from the Telegraph in March this year said it all: Ofsted inspections: You’d be better off flipping a coin. This was not simply media spin. The think-tank report, “Watching the Watchmen”, on which the article was based, actually goes further: “In fact, overall the results are worse than flipping a coin”.

It’s safe to say that the think-tank in question, Policy Exchange, is on the right of the political spectrum. It is also perhaps not entirely coincidental that one of its founding members was a certain Michael Gove, and that the Policy Exchange report on Ofsted was highlighted by the right-of-centre press during the period of spats between Wilshaw and Gove mentioned above. None of that, however, detracts from the data cited in the report. These resulted from the work of Robert Coe and colleagues at Durham University and stemmed from a detailed study involving more than 3000 teachers. Coe has previously criticised Ofsted’s assessment methods in the strongest possible terms, arguing that they are not “research-based or evidence-based”.

Ofsted asks governors to treat its data as objective and to draw conclusions accordingly. However, without a suitable ‘control’ study – which in this case is as simple as running independent assessments of the same class with different inspectors – the data on inspections simply cannot be treated as objective and reliable. In this sense, Ofsted is giving governors, schools, and, more generally, the public exceptionally misleading messages.
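For readers who want to see what such a reproducibility check might look like in practice, here is a minimal sketch in Python. It computes Cohen's kappa, a standard measure of how far two raters agree beyond what chance alone would produce. The grades below are entirely made up for illustration (they are not Ofsted data, and not the Durham data), and the function and variable names are my own assumptions.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Agreement between two raters, corrected for chance agreement.

    1.0 means perfect agreement; 0.0 means no better than chance.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)

    # Fraction of lessons on which the two inspectors gave the same grade.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Agreement expected by chance, given how often each inspector
    # hands out each grade.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[g] * counts_b[g] for g in counts_a) / n ** 2

    if expected == 1:
        raise ValueError("kappa is undefined when both raters use a single grade")
    return (observed - expected) / (1 - expected)


# HYPOTHETICAL grades (1 = outstanding ... 4 = inadequate) given by two
# inspectors who independently observed the same ten lessons.
inspector_a = [1, 2, 2, 3, 2, 1, 3, 2, 4, 2]
inspector_b = [2, 2, 3, 3, 1, 1, 2, 2, 3, 2]

kappa = cohens_kappa(inspector_a, inspector_b)
print(f"Cohen's kappa: {kappa:.2f}")
```

With these invented grades the inspectors agree on half the lessons, but about a third of that agreement would be expected by chance, so kappa comes out at roughly 0.23. The point is not the particular number; it is that a check of this kind is cheap to run, and without it nobody can say how much of an inspection grade reflects the school and how much reflects the inspector.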

But it gets worse…

The lack of rigour in Ofsted’s inspections is just one part of the problem. It’s compounded in a very worrying way by the shocking abuse of statistics that forms the basis of the Data Dashboard and RAISE Online. Governors are presented with tables of data from these websites and asked to make ‘informed’ decisions on the basis of the numbers therein. This, to be blunt, is a joke.

It would take a lengthy series of blog posts to highlight the very many flaws in Ofsted’s approach to primary and secondary school data. Fortunately, those posts have already been written by a teacher who has to deal with Ofsted’s nonsense on what amounts to a daily basis. I thoroughly recommend that you head over to the Icing On The Cake blog where you’ll find this, this, and this. The latter post is particularly physicist-friendly, given that it invokes Richard Feynman’s “cargo cult science” description of pseudoscientific methods (in the context of Ofsted’s methodology). It’s also worth following Icing On The Cake on Twitter if you’d like regular insights into the level of the data abuse which teachers have to tolerate from Ofsted.

Coincidentally, I stumbled across that blog after I had face-palmed my way (sometimes literally) through a meeting in which the Ofsted Data Dashboard tables were given to governors. I couldn’t quite believe that Ofsted presented the data in a way such that the average first-year physics or maths undergraduate could drive a horse and carriages right through it (if you’ll excuse the Goveian metaphor). So I went home and googled the simple term “Ofsted nonsense”. Right at the top of the list of hits were the Icing On The Cake posts (followed by links to many other illuminating analyses of Ofsted’s assessment practices).

I’m not going to rehash those posts here – if you’ve got even a passing interest in the education system in England you should read them (and the associated comments threads) for yourself and reach your own conclusions. To summarise, the problems are multi-faceted but can generally be traced to simple “rookie” flaws in data analysis. These include:

  1. Inadequate appreciation of the effects of small sample size;
  2. A lack of consideration of statistical significance/uncertainties in the data. (Or, at best, major deficiencies in communicating and highlighting those uncertainties);
  3. Comparison of variations between schools when the variation within a given school (from year to year) can be at least as large;
  4. An entirely misleading placement of schools in “quintiles” when the difference between the upper and lower quintiles can be marginal. Ofsted has already had to admit to a major flaw in its initial assignment of quintiles.
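The first and last of these flaws are easy to demonstrate for yourself. The sketch below is a toy model under assumptions of my own choosing, not a reconstruction of anything Ofsted does: 100 hypothetical schools, each with a single class of 30 pupils, and every pupil in every school given exactly the same 80% chance of reaching the expected level. Any differences between the "schools" are therefore pure noise.

```python
import math
import random


def standard_error(pass_rate, cohort_size):
    """Binomial standard error of an observed pass rate."""
    return math.sqrt(pass_rate * (1 - pass_rate) / cohort_size)


def simulate_observed_rates(true_rate, cohort_size, n_schools, seed=0):
    """Observed pass rates for schools that all share the same true rate."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rates = []
    for _ in range(n_schools):
        passes = sum(rng.random() < true_rate for _ in range(cohort_size))
        rates.append(passes / cohort_size)
    return rates


# ASSUMED values, chosen purely for illustration.
TRUE_RATE = 0.8   # identical underlying pass probability everywhere
COHORT = 30       # pupils per school
N_SCHOOLS = 100

se = standard_error(TRUE_RATE, COHORT)
print(f"Standard error for one cohort: {se:.3f}")
print(f"Approximate 95% interval: {TRUE_RATE:.0%} +/- {1.96 * se:.0%}")

# Rank the identical schools and compare the top and bottom quintiles.
rates = sorted(simulate_observed_rates(TRUE_RATE, COHORT, N_SCHOOLS))
fifth = N_SCHOOLS // 5
bottom_mean = sum(rates[:fifth]) / fifth
top_mean = sum(rates[-fifth:]) / fifth
print(f"Bottom quintile mean pass rate: {bottom_mean:.0%}")
print(f"Top quintile mean pass rate:    {top_mean:.0%}")
```

The standard error for a cohort of 30 at an 80% pass rate is about 0.073, so a single year's result carries an uncertainty of roughly 14 percentage points either way. Sorting the simulated schools into quintiles then produces an apparent gap between the "best" and "worst" schools even though, by construction, there is no difference between them at all.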

What is perhaps most galling is that many A-level students in English schools will be taught to recognise and avoid these types of pitfall in data analysis. It is an irony too far that those teaching the correct approach to statistics in English classrooms are assessed and compared to their peers on the basis of Ofsted’s pseudostatistical nonsense.

Image: Manipulating data. Credit: https://www.publicdomainpictures.net/en/view-image.php?image=45242

Author: Philip Moriarty

Physicist. Rush fan. Father of three. (Not Rush fans. Yet.) Rants not restricted to the key of E minor...

10 thoughts on “Lies, damned lies, and Ofsted’s pseudostatistics”

  1. Excellent post. What’s the point in highlighting the concepts of the ‘scientific method’ to students when the very body that is supposedly assessing the quality of their school is such an exemplar of bad practice?
    Next thing we know we’ll have an environment minister who is a climate change denier, an agricultural minister who thinks culling badgers will eradicate TB in cattle and a health minister who is happy to pay for witch doctors.
    Pa! Who needs science anyway. We can run the country just fine without it


  2. The main problem here appears to be that policy makers are demanding data so that they can make decisions. I think, given this pressure, civil servants are convincing themselves that because they are using the data in a comparative fashion, that the basic underlying problems in the data are somehow cancelled out. They have to generate quantitative data, and this is the best that they can do, so that’s what they do. This is clearly nonsense. Politicians demanding data is a powerful cause of nonsense, I think.


    1. That’s a perceptive point. But until schools, governors, and (Head)teachers start to refuse to participate in this nonsense it is not going to go away. (And universities are no better: how many base their ‘strategies’ on chasing league table positions in dubious global rankings?)

      “Everyone knows” that the assessment methods are nonsense; “everyone knows” that the data are boll**ks; “everyone knows” that, as John states in his comment below, the measurement/assessment process distorts the data. And yet nothing changes because we have to be “grown up”, act pragmatically, and “play the game”.

      I’ll quote the venerable George Bernard Shaw again:

      The reasonable man adapts himself to the world; the unreasonable one persists in trying to adapt the world to himself. Therefore all progress depends on the unreasonable man.


  3. As someone who just finished A Levels I saw the strange ways of Ofsted for myself. I would add to your takedown of their methodology by saying that the current system of inspection itself is flawed. At my college the teachers were told about a month in advance that Ofsted would be coming sometime that month, which naturally made them rather stressed, and that stress was then passed on somewhat to us. It’s not just stress that is a problem though; the disruption to lessons is ridiculous. My physics teacher would try to make her lessons as close as possible to what she thought the Ofsted ideal was, thus making the inspection pointless. I think inspections either need to be very routine, so that teachers are comfortable with someone judging them in lessons (although how someone can get used to that I don’t know!), or need a totally different methodology.

    But I disagree with you about Gove’s views being totally Victorian. Things like academic rigour in the syllabi, especially A Level maths, isn’t Victorian; it’s a universal value in academia, or at least one we should strive to achieve like scientific integrity. I didn’t agree with many of his reforms & views at all, like free schools, academies or league tables, but those didn’t mean that all his ideas were bad.

    I think the way some teachers dismiss all of Gove’s reforms instead of assessing them on their own merit falls into the cargo cult science that Feynman talks about; teachers should be the last profession to become so sure of the way they do things that it becomes unquestionable to do it differently.

    Anyway, thank you for reading this, I’d love to read your response; Ofsted isn’t questioned enough.


    1. Thanks for a great comment, John. From what you say, Ofsted’s inspections are another example of Goodhart’s law in action.

      As regards my comments on Gove’s proto-Victorian stance, it’s not a question of rigour. Indeed, I have quite some sympathy with the idea of introducing more mathematical rigour into the physics A-level (see this video, for example), although I think we should also consider dramatically increasing the amount of computing in the physics course. (I’m far from a good mathematician but I am, or at least used to be, a reasonable coder. My mantra during my undergrad and postgrad student days was “If I can’t code it, I don’t understand it”.)

      But education is very much more than just physics, of course. I wrote the post from my perspective as a school governor. My comments about Gove’s Victorian stance arose from his damaging inability to appreciate that teaching is much more than the traditional rote learning/memorisation of facts and their regurgitation, and that there is more to education than the narrow type of 50s-style curriculum he favoured. And that teaching is not explanation by an authority figure. And that a thorough grasp of your subject does not necessarily make you a good teacher. And that there’s value in vocational subjects. And coursework. And that his free school ideology is unsupported by detailed evidence or research.

      etc., ad nauseam


      1. Yes it fits Ofsted very well. I think something is very wrong when the whole point of Ofsted, uncovering poor teachers as I was told by an inspector themselves, is basically impossible to do with the current model; at least in my college, every teacher would have appeared the same to Ofsted (by trying to mimic its ideal). Unfortunately, I don’t think this government or the next are interested in reforming Ofsted at all.

        As someone who programmes a little in their spare time, I very much agree with more programming in physics; for myself, I found that programming often helped with logic/mathematical thinking when I was younger. Actually, Gove did introduce programming to the curriculum (including primary school, if my memory is correct), but I feel an opportunity has been missed by not integrating it in with other subjects at GCSE & A Level; programming is a tool, even in computer science, so to miss the applications is a real shame; however the new A Level maths syllabus does encourage computing (not explicitly programming though), & hopefully teachers will incorporate it, the syllabus is much looser, so there is room for it.

        Perhaps the reason Gove’s reforms have been worse for primary or early secondary education is that he has taken a hands-on approach to them, rather than delegating as with later forms; again, contrasting with the A Level maths syllabus, which was composed by a group taking input from universities, & professional bodies, like the Royal Statistical Society.

        On a tangent, governors are probably more important than I think most people realise. I’ve met one of the governors at my college, who wanted to help me & another student with UCAS, because of the universities we were applying to, & it was wonderful talking to someone who had such an interest in the education the college provided, & what the students would achieve!

        Thanks for replying, this comment is getting a bit long, so I shall stop here.


  4. [I would like to sincerely apologise for the length of this post, so I do thank you for your time if you read it. Being in school at the moment, it is something I feel quite strongly about.]
    The way “data” is presented isn’t even the first problem. Bearing in mind an “inspection” has to be done first before anything can be produced, I feel the biggest change has to be there. That said, I have just finished my AS levels and begun my A2 levels, so I guess the end of the process that I see is the one more relevant to me.

    I’d also like to point out my school obtained an “Outstanding” in all categories. Take this as you will, but it is a mark which holds little meaning to me for the reasons John and yourself have pointed out. In fact, the only reason I’m aware of it is because it is shoved in my face every other week during assembly. I also don’t tend to read Ofsted reports, simply because the idea of degrading how well a school stands to one of four adjectives hardly seems like the right way of doing it. And no, that doesn’t mean adding more adjectives that could be used to describe a school in one or two words will fix it.

    I think the main issue to be addressed is how long the school has to prepare for an “inspection”. As John points out, they’re told a month in advance that it will happen, so the “perfect” day is planned. A prime example of how wrong of an impression it gives is my school’s inspection. The day was, well, different, to what we usually have.

    First of all, teachers were greeting us at the school gates. Unusual, as they’re usually in meetings. Then we saw the headteacher make an appearance, talking to students, no doubt threatening them to be good or be expelled. Not seen that since that day.

    Then, some teachers had lesson plans laid out, which was a first. Typically it’s not done in my school because the rate at which people learn is different for different topics, so one thing may take half a lesson while another may take two. We’re also encouraged to have discussions in my classes where we discuss the topics with my teachers and try to challenge them on what they say, with other students adding support or their own thoughts on the matter. This works well as the teacher can see if we understand what they are saying and it puts them on the spot to justify what they teach. Ofsted comes along and suddenly we only talk when our hands are up and we’re just talked at. So, does Ofsted know how we’re taught? Not really.

    And at the end of the day, what is the school being assessed on? Shouldn’t it be mainly on the standard of teaching? There are other important factors to consider, but schools are institutes where people are taught. And my school should be fully capable of passing inspections with its methods. We get more than our fair share of A’s and A*’s, so where’s the problem? Could it be that there is a formula that exists that guarantees an “Outstanding” in Quality of Teaching? Yeah, probably. And there’s the other problem. You shouldn’t judge a school on whether it follows x, y and z, but rather on the success of their methodology and how exam results come out, although even that’s problematic. (I’ve seen your video on the state of Physics education in schools, and I do agree about the way the market works and getting rid of it).

    This comment is long enough, but I do want to touch on one other point which I believe is key. There’s one part of the inspection where the school “randomly” selects students to fill in a survey to give opinions on the school. And there’s the problem. The results of such a survey are entirely skewed due to the simple fact that the students selected are more than likely to give a shining review of the school just because they’re doing well. Why doesn’t Ofsted make the selection, or heck, take the time to have all the school do it. Do it online with logins to help protect the school and inspection from students doing it more than once, I don’t know, just don’t have the school select the students to fill it in.

    I said “inspection” instead of inspection because really it’s a joke. My school won’t be inspected for another ten years, so who knows where the quality of education will go if it doesn’t need to care for so long. Schools should, at most, be given a couple days notice, even better if the inspectors turn up on the day. It should be over a long period of time, and the school should not have any involvement in terms of what students are talked to and who gives their opinions. There’s more to go into in this topic, and I’m glad you brought it up because it is something that p!**es me off about education, but I’ll spare you!

    Anyway, thanks for your time,
    -Tom

