Policy Exchange has done well to make its report on Ofsted available on its website before plugging it in the media. As we might expect, it is incisive and includes home truths that will be as uncomfortable for those in charge of Ofsted and the regional inspection service providers (RISPs), who carry out inspections, as for those of us who have been involved in them. Twenty minutes, for example, is not long enough to make a valid judgement either on the overall quality of a lesson or on the quality of teaching.

Out of the four sub grades awarded by Ofsted, for teaching, achievement, behaviour and safety, and leadership and management, the judgement for achievement is usually identical to the overall judgement for the school. This last point is sensible – the purpose of schools is to maximise achievement, and everything else either contributes to, or detracts from, this.

The big question, though, is “Is this judgement accurate?” If it is, then this is the basis of fairness, provided it is also expressed in a professional manner, and followed by proportionate action, by which I do not mean immediate removal of a headteacher or senior staff other than in extreme cases. The same question can then be asked of each stage in the process, and it is right that it should be asked.

An inspector going into a classroom has that teacher’s life’s work in his or her hands, as well as the education of the children. Inaccurate judgements destroy teachers’ careers and cause stress-related illnesses that are both disruptive and expensive.

It is, in practice, more difficult to make a judgement on the quality of teaching than on other factors. It’s not difficult to see whether or not children are behaving themselves and working properly, or whether a school has its child protection paperwork in order, although the skills required to secure the former in difficult areas are of paramount importance and often underestimated. It’s not hard to see if a school is run well or not – there are lots of sources of evidence for this. And for achievement, we have data.

Or do we? This is the key problem with the thinking behind this report, and with all of the thinking of Ofsted since Labour’s changes in 2005, which in turn were strongly influenced by the LibDem peer, Lady O’Neil. The authors see data as a reliable indicator of a school’s performance, provided it is properly understood.

They note, though, these reservations by Lord Bew:

  • that data could be actively manipulated
  • that data does not measure important things
  • that data does not sufficiently take account of a school’s context

The final objection does not really concern data, but its interpretation. The first two, and particularly the first, are fatal. Here are some reasons why:

  • We do not have a clear baseline for children starting school. Interpreting guidelines in any way the user likes allowed Birmingham in the early nineties to claim that children in its most deprived areas were outperforming the national average at five.
  • Infant schools therefore have no baseline, and the need to protect young children from stress, so that data are collected by their own teachers, makes the schools judges in their own cause. The authors note Lord Bew’s point that this leads to impossible (my italics) targets for junior schools, a feature I saw last week. Even the phonics check has been affected by this.
  • Key Stage 2 SATs have been manipulated nationally to suit the agendas of government agencies, particularly Labour’s agencies, and are invalid. The English test is not properly marked according to the national curriculum. The maths test is properly marked, but does not include enough questions to test children’s mastery of basic skills. The previous science test allowed so much discretion in the amount of help provided to children that the results were nonsensical – nearly everyone passed. The Spelling, Punctuation and Grammar Test is objective, fair, and an important step in the right direction.
  • SATs do not provide a fair starting point for secondary schools, for whom the expected Level 4 in English indicates that a pupil can’t write properly. The artificial levels assigned to pupils who don’t meet Level 4 are even less reliable – under the dishonest mark system, some of these children are assigned Level 3 when they are barely working at the level expected of a seven-year-old.
  • GCSEs and A levels have been corrupted by coursework, dishonest “equivalences” and manipulation of marks.

So, the problem with basing inspections on data is that we have no honest and reliable data. Until we do, the approach can’t work.

The report raises big objections to the role of RISPs, an invention of Labour that allowed it to pretend to privatise parts of the system, while doing this in a way that let it dictate every detail of the companies’ organisation, including imposing illegal quotas on the recruitment of inspectors from minority ethnic backgrounds, and removing the notion that inspectors needed to be qualified in the areas they were inspecting. This leads to ridiculous judgements that have lost Ofsted the respect of teachers.

RISPs themselves are criticised for lack of rigour in training and monitoring inspectors, which is fully justified – I have been on the receiving end of RISP training, and could not conceal my contempt for it – there are some good people involved, but too many ignorant cowboys. RISPs were designed to be personal ciphers for Sir David Bell as HMCI, and should be abolished.

The report repeats an error endemic in educational research, of taking data that does not quite fit – in this case data on lesson observation from the US – admitting its flaws, and then using it anyway. US systems are completely different from ours, and the amount a study cost ($50m) does not affect its relevance. However, the authors’ idea of a short inspection followed by a “tailored” inspection, with twice as many inspector days as at present, is a good one.

At the heart of the problem is the disastrous 2005 Education Act, which took away inspectors’ independence, and the time they needed to do their work properly. Twenty-minute observations, rightly criticised both in this report and, in my hearing, by senior serving HMI, date from this time. A provocative and interesting report, but its central recommendation will only work once the government has completed its efforts to clean up the corrupted data on which the authors wish to rely.