Archive for June, 2011

 

Friday, June 17th, 2011

A study apparently demonstrating the benefits of academy status seems to have been highly influential in recent weeks.

The research, by academics at the London School of Economics, was published in April. It has been picked up not only by Blairite commentators who backed the original academies policy, but now by the Department for Education in its push to encourage all schools to become academies.

I would also hazard a guess that it was in the mind of the Today programme presenter Sarah Montague when she asked a sceptical head teacher yesterday morning to accept the statement that academies improve schools’ results.

The research, by Stephen Machin and James Vernoit of the London School of Economics, produced some conclusions which look very positive for academies. As the Financial Times reported when the research was published, the study found that “turning a school into an academy improves its performance – and that of neighbouring schools”. The study was based on an analysis of pupil-by-pupil results of schools turned into academies under Labour, in the years 2002-9, when most of the institutions converting had low GCSE results. It includes a caveat that it does not relate to academies which have converted since the coalition came to power.

Having looked at this research in detail now, I am very impressed with a number of aspects of its methodology. Specifically, it performs statistical checks on institutional results which seem far more robust than similar exercises which have been carried out in widely-cited analyses of the academies policy in the past.

However, there is a gap in this research: it offers no qualitative investigation into how academies opened under Labour have managed to produce their apparently impressive statistics.

This is an obvious question to ask: though academies’ benefits are often cited in broad-brush, quasi-ideological terms (such as allowing schools to break away from LA influence, encouraging innovation through a sponsor, or just simply promoting an often undefined quality called autonomy), why in detail would simply changing the structure of a school’s governance make a difference? What precisely have academies done to drive these results improvements? If they have greater independence, how have they used it and what has been the connection with results?

And once you look into that, as this blog and other research by the Civitas think tank have done, you start to have doubts over whether this policy is quite the panacea it is now widely claimed to be.

OK, first the impressive bits, then. For me, if you want to know whether schools can improve their results by being turned into academies, and you want your research to have any claim to credibility, you have to do at least two things, neither of which seems to have loomed large in claims made about academy results in the past.

First, you have to compare like with like. Over the past few years, governments have looked at the GCSE (or equivalent, of which more below) results of academies and compared them to those of the schools these academies replaced. On average, they have tended to find academy results improving – at least on the headline published figures – at a faster rate than the predecessor schools had managed. Therefore, the argument goes, here is evidence that the academies policy is a success.

There are a couple of serious objections to any conclusions based on these calculations, though, including the following. What if the pupil clientele changed from the time before the school was an academy to now? The schools converting to academy status under Labour generally tended to have relatively large numbers of disadvantaged pupils. If the replacement of such schools by academies tended to draw in pupils from slightly less disadvantaged backgrounds, with better results from their primary schools – attracted, perhaps, by the huge extra investment in new buildings that went with academies under Labour – this would work to the academy’s advantage. But it might mean that, when results rose, it was more to do with changing pupil intakes than with anything the academy had done itself.

The Machin and Vernoit research tackles this issue by comparing the key stage 2 test results of pupils who joined the schools before they became academies, during the period under study, with those of children who joined the schools after they had become academies.

And the study finds that the pupil intake of academies did indeed “improve”. In other words, the academies under study were taking in pupils with better key stage 2 results than had been achieved by pupils entering the schools the academies replaced.

But here is the impressive bit: the researchers found that, even after taking this pupil intake factor into account, the results achieved in the academies were better than those achieved by a control group of schools.

The second impressive aspect of the study was that it sought to take into account the effect on neighbouring schools. This has always seemed to me to be important, since the success or failure of a policy should be judged not only by its effect on an individual institution but by its impact on an entire area: if an academy – which under Labour usually came with new buildings worth eight-figure sums – succeeded only by drawing in more “educable” pupils from neighbouring schools, while those around it suffered and their results declined, this would raise questions about the policy.

But the Machin/Vernoit research looked at this issue, too. It found that neighbouring schools did suffer (to put it crudely) from the introduction of an academy nearby, in that the average achievement level of the pupils they recruited in year seven, as measured by their primary test results, fell. In other words, some of the higher-achieving pupils moved, at the end of primary school, to the academy, whereas in previous years they might have attended its neighbouring school. However, despite their intake getting “tougher” in this way, the results in these neighbouring schools at GCSE also improved. The paper suggests that this was probably the result of greater competition from an academy nearby spurring improvement, on the main results metrics, by the neighbouring schools.

OK, that’s the good news. Here I come to my beef with this study. And I should say first that I am not trying to hit academies over the head for the sake of it with observations about strategies some schools might use to boost results. (The other day I met, as it happens, the principal of an academy with a very tough, non-selective intake in an area with grammar schools, now under pressure from the Government’s new GCSE floor targets, and thought what a challenging, important job that must be.) But neither do I think that we should just abandon detailed scrutiny of whether academies are quite the answer to all educational problems that they are being made out to be, and of what their results really tell us.

So back to the research. The trouble is, for all the statistical expertise and checking that has gone into this study, it is still based on the assumption that you can use a set of exam results formulae – on one or two performance indicators – to answer definitively the question of whether these schools are actually providing a better education than their predecessors. In other words, the implicit view is that this question can be answered entirely statistically, without any reference to a qualitative understanding of what has actually happened to make these schools “better”.

Yet there are some fairly big alternative explanations. The obvious one is that academies have simply been more results-focused, in the main, than other types of school, and have therefore sought to do whatever it takes to boost grades on the Government’s published indicators. That would mean that while the central indicators improved, other indications – statistical or otherwise – might give cause for concern. So while the stats improved, if you tried to get a wider sense of what might be felt to matter in education, you would get a different picture. Academies might, to put the hypothesis more crudely, have paid more attention to gaming the results indicator system than other schools.

You could say it is unfair to single out academies in this way, and to newcomers to this blog this might sound hyper-cynical. But, as I’ve written before, academies under Labour seem to me to have been under more pressure to raise results than other schools. Most of these schools were specifically created to address the claimed underperformance of a predecessor school. They often came with tens of millions of pounds of extra funding for new buildings. Their results were subject to extra scrutiny in the media, not just at the school level but at the level of the national politicians overseeing the academies policy, whose reputations were staked on headline scores improving. They might – though I am guessing here – also often have come with a business mentality, reinforced by their sponsor, which incentivised senior leaders to get results up, come what may, through bonuses linked to GCSE exam performance. It would be surprising, then, if one or all of these factors did not produce a very strong focus on those headline measures.

So, how do we check whether explanations other than a genuine, general improvement in education lie behind the results gains the study cites for the academies under investigation?

Well, I have to confess here that I have no killer line, or proof that this study is wrong in its conclusions. But I do think we should be wary about them. I want to come at this first statistically, and then anecdotally.

First, on the statistics, another impressive aspect of this research is that it does attempt to address, through the data of course, the most obvious way in which results could have been boosted artificially, if you like. This is through the use of non-GCSE qualifications.

Under the system in operation in recent years, other courses are counted as “equivalent” to GCSEs, for league table and results purposes. This is the case for the main measure used in this study: the proportion of pupils in each school achieving five A*-C grades at GCSE or vocational equivalent, including maths and English. Yet the fact that some of the GCSE-equivalent courses have been given high weightings in the results formulae – worth up to four GCSEs – and have high pass rates means that they can have a heavy influence on the overall published results. Schools encouraging high numbers of pupils to take these courses – whether they are doing so because of their own need to boost results, because of students’ needs or a bit of both – are therefore likely to get a results improvement out of doing so. Might not academies, then, under greater pressure to produce results gains, simply be turning to these courses to a greater degree than other schools?
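To make the mechanism concrete before returning to the research, here is a minimal sketch in Python – entirely hypothetical, and not the study’s method or any official formula – of how a single heavily weighted “equivalent” can carry a pupil over the headline threshold:

```python
# Hypothetical sketch of the headline measure: five A*-C grades including
# English and maths, where one vocational "equivalent" can count as up to
# four GCSEs. The weightings and the example pupil are invented.

def meets_headline_measure(gcse_ac_passes, equivalent_volume,
                           english_and_maths_at_c, include_equivalents=True):
    """True if a pupil counts towards the 5 A*-C (inc. English and maths) figure."""
    volume = gcse_ac_passes + (equivalent_volume if include_equivalents else 0)
    return english_and_maths_at_c and volume >= 5

# A pupil with C grades in English, maths and one other GCSE, plus one
# vocational course weighted as four GCSEs at grade C or better:
print(meets_headline_measure(3, 4, True, include_equivalents=True))   # True
print(meets_headline_measure(3, 4, True, include_equivalents=False))  # False
```

On those invented numbers, the same pupil counts towards the school’s headline figure only when equivalents are included – which is exactly why schools under results pressure might find such courses attractive.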

So, back to the research. I was surprised to find that not only did Machin and Vernoit address this possible alternative explanation for the better results of academies, but that, when they did so, they found that it did not explain the results improvements academies seemed to show. In other words, the use of non-GCSE “equivalent” qualifications did not explain the relative success of academies, they suggest. The success, then, stood even after taking into account this possible alternative explanation.

The way they calculated this was fairly straightforward: simply to perform their calculations using GCSE qualifications alone as the measure of success in each school, rather than GCSEs “or equivalent”.

This, they say, represents their check on this idea – that I refer to above – “that the performance improvements [in academies] are largely driven by performance improvements in unconventional subjects”.

So, they conclude that putting pupils on “unconventional” GCSE-equivalent courses does not explain the academies’ results success. I should say here that I have neither the professional statistical expertise of these researchers nor the time they no doubt spent on their study. But I would say that it is a slightly odd conclusion, given some other things we know about academy results, as revealed in more recent data sets.

First, I have performed a very crude version of the kind of test they used in their study, simply by looking at the latest published GCSE results of academies (all of them set up under Labour, and therefore the group from which the LSE study schools were taken) with “equivalents” and without. I have then compared these figures with those of non-academy schools.

I did this using Department for Education spreadsheets, adding up the number of pupils in academies in 2010 who achieved five A*-Cs including English and maths in GCSE or vocational equivalent, and comparing that to the total number of pupils in the academies they attended. The same calculation was performed to total up the number of pupils in academies achieving five or more GCSE A*-Cs when these were not allowed to include “equivalents”.

The figure for academy results – the proportion of pupils achieving five or more A*-Cs including English and maths, with vocational equivalents counted, which was the main published measure used in league tables under Labour and continues to be the main target for schools under the coalition – comes out at 43.3 per cent. Without equivalents, it drops to 33.0 per cent, a fall of 10.3 percentage points.

Now, a similar comparison for non-academy schools reveals a far smaller gap. With equivalents, non-academies end up on a figure of 57.0 per cent. Without equivalents, they finish on 52.5 per cent. This is a gap of 4.5 percentage points.

So, on the 2010 figures, “GCSE-equivalent” courses have contributed far more to academies’ headline results than they have at non-academy schools.
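For anyone who wants to reproduce this rough check, the calculation amounts to something like the following sketch (the file and column names are invented stand-ins for the Department for Education spreadsheets, which would need matching up first):

```python
# Sketch of the crude 2010 comparison described above: pool pupil counts
# across each group of schools and compute the headline rate with and
# without GCSE "equivalents". Column names are hypothetical.

import pandas as pd

def pooled_rate(schools, achieved_col, cohort_col="pupils_at_end_of_ks4"):
    """Percentage of pupils across the whole group reaching the given measure."""
    return 100.0 * schools[achieved_col].sum() / schools[cohort_col].sum()

def equivalents_gap(schools):
    with_eq = pooled_rate(schools, "n_5ac_em_inc_equivalents")
    gcse_only = pooled_rate(schools, "n_5ac_em_gcse_only")
    return with_eq, gcse_only, with_eq - gcse_only

# schools = pd.read_csv("performance_tables_2010.csv")   # hypothetical file
# for is_academy, group in schools.groupby("is_academy"):
#     print(is_academy, equivalents_gap(group))
```

Pooling pupil counts in this way, rather than averaging school-level percentages, matches the method described above.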

Second, there is evidence from the Government’s much-debated new English Baccalaureate measure. This found, as I blogged about here, that nearly a third of academies with results to report had a score of zero per cent on the English Bacc, which records the proportion of pupils in each school with A*-Cs in English, maths, two sciences, a language and history/geography. Furthermore, the proportion of academies with that zero score on the EBacc was twice as high as it was with a comparison group of schools with similar intakes.

These data would suggest, then, that if academies were improving their results, they were not doing so exclusively in the narrowly “conventional” subjects that Michael Gove has chosen to highlight through the EBacc. Yet the LSE study says its figures do not show the improved results at academies are the product of gains in “unconventional” subjects. So, to repeat, it is strange how this evidence contrasts with the LSE research.

Other than the GCSE “equivalents” move, there are other strategies which can be used to boost school performance if schools of any kind are particularly desperate to see their statistics improve. These include entering pupils multiple times for GCSEs in English and maths in particular, with schools knowing that these are crucial to their published rates. The Advisory Committee on Mathematics Education documented this practice in relation to maths last month, pointing out that sometimes pupils would be removed from the subject by their school if they achieved a C grade earlier than the end of their course, to give them time to focus on other subjects important to the school’s results, even though the pupil might be chasing a grade higher than a C in maths (not important to the school’s published indicator). I have no evidence that this has happened to any greater degree in academies though, as I say, I think the pressures on most of them to improve results have been great. But any study should be aware that headline results indicators will often not present the whole picture of what has been going on in schools.

My final detailed response to the study is anecdotal. And here, I just want to refer back to my original blog on academies’ EBacc results, a couple of months ago, for evidence.

This made several points in relation to studies and anecdotes on the subject of history.

Academies were more likely than other types of school to have relatively few students studying history to GCSE, according to research by the Historical Association. Academies were also more likely to have a two-year key stage 3, which gives pupils more time to prepare for GCSE but concerned the HA because it meant many pupils were likely to lose one of the only three years in which they would study history at secondary school.

The report also quotes a teacher, from an academy, saying: “History is seen to be too academic! …Students who are predicted lower than a B are not allowed to study the course…We are also not allowed to run ‘entry level’ courses for students with specific needs, as that is not thought to be meeting the attainment targets for the academy.”

An Ofsted report on history teaching in primary and secondary schools, published earlier this year, also documented lower numbers taking history in academies. It found: “Entries for GCSE history from academies were significantly lower than for maintained schools overall.”

One online comment, posted after a 2009 TES story documenting another academic report on the pressures facing history as schools sought to boost their league table results, ran as follows:

 “I used to work in an academy in London, and as I was leaving I had to rank every pupil in year 8 as an A, B or a C. A means that they could get an A or a B at GCSE. Therefore history appeared in their option forms. The B category were pupils who were borderline C/D. The C meant that they were predicted grades G to D. Neither categories B or C had history on their option forms! They were encouraged to take other less rigorous subjects.

“Even though I had known students previously predicted Ds and Es get outstanding results, who went on to do exceptionally well at A-level, and some even went on to do history at university.

“What was most upsetting was the case of one student, with a range of learning difficulties. He loved history, and orally he was phenomenal. He was put in category C, and was therefore being guided down a different pathway. He was devastated that he would not be able to take history in year 9-11. His mother rang the school, and explained that it was likely whatever course he was entered into, he would be unlikely to either pass or do very well in, so why couldn’t he at least take a subject he enjoyed?

“The plea fell on deaf ears and the boy was placed in some random BTEC or GNVQ course taught by some bland paper pushing academy drone who was being shipped in to ‘sort’ the school out of failing pupils and failing teachers.”

If you look back to my earlier blog, you will find reference to the parent of a pupil at a school taken over by the Harris chain of academies, who told me (and the local paper) that her daughter had been forced to take a BTEC sports course (worth two GCSEs to the school), at the expense of French GCSE, despite her daughter having no interest in sport. This was a clear case, said the parent, of the needs of the school to boost its published results taking precedence over those of her daughter.

So in response to this LSE study, I have put forward some statistics that run contrary to one of its more important findings, and also some anecdotes.

Not much, you might think. But there is a bigger point here: there should be more to the evaluation of a policy than simple results statistics, however clever the methodology and however robust the statistical cross-checks – especially in a system as complex as secondary school results calculation, which offers plenty of opportunities for schools to take tactical decisions to boost results. This risks encouraging behaviour, within particular subjects, that is less than ideal from a pupil’s point of view.

And is the number that appears at the end of the educative process all that matters? Or do we care about what happens along the way, and how the numbers are generated? If particular subjects have been affected in the drive for higher results, should an influential study like this not investigate that and have something to say about it? Or should such a perspective just be ignored – the idea being that we lay down the statistical rules for success, check whether the statistics have risen and that, apart from some clever checking of the data, is pretty much it?

To sum up, how do we know that academies under Labour did not simply pursue a more relentlessly focused version of “Education by Numbers”?

I think that if researchers are going to make claims which will be used by others – whatever the caveats in the original research – to say categorically that a policy “works”, and by implication that the education on offer in academies is better in a general sense than in other schools, they are going to have to be prepared to dig a little deeper – and not just statistically – into what has been going on behind the figures. Economists who do not do this will never be able to see or pronounce on the whole picture, I believe. Their research will therefore always be incomplete.

So it is a shame that statistics are simply being held up as conclusive evidence, one way or the other. This really is not, I think, for all the complicated formulae and technical expertise on display in this paper, a very sophisticated way of understanding what has really been going on in our schools.

- Warwick Mansell

posted on June 17th, 2011

Wednesday, June 1st, 2011

Right, I haven’t blogged for a while, but thought I’d just post here an extract from a speech I made just after Christmas about what can be read into English Sats results for 11-year-olds.

I’ve been prompted to do this after reading, over the last two days, the Evening Standard’s coverage of what it claims is a literacy crisis in London.

Yesterday, part of its front-page coverage talked about one in four children being “practically illiterate”, seemingly based on the proportion of pupils achieving level 3 or below in English Sats.

Today, it highlighted the number of pupils “with a reading age of seven”, based, I think, on the numbers achieving level two or below (the level said by the Government to be the expected reading standard of a seven-year-old).

I don’t think the test statistics can support the interpretation being put upon them. It may be that we have a literacy problem in the capital, or in the country as a whole. But the test data used as a good part of the news hook for the coverage do a poor job of telling us the nature of the problem. It does not help that news coverage often fails to put the numbers in perspective. Ideally, it would give us unsensationalised information on whether the statistics are on an upward, downward or static trend, and on how this country compares to others, but this tends not to happen.

Anyway, here’s the extract of that speech, prompted in part by similar coverage on the Today programme before Christmas.

I want to talk about the over-interpretation of test results: they don’t tell us nearly as much as we might think they do. Perhaps just as importantly, in our public debate around education we don’t really use the data to understand what is going on in schools or with pupils’ learning, and in that sense we are letting children down, because we should, I think, be using assessment information in a far more sophisticated way. And bear with me, as I am going to have to go into a bit of detail here.

So, I’ll just start with a question: What is the definition of the level of understanding expected of an 11-year-old in reading? How is this defined by the government, by the media, and thus by people nationwide in the debate about this vitally important subject?

What does it mean, within the detail of what children have to achieve, for them to perform at that level?

Well, in 2010, it came down to this: the ability of a child to score 18 marks out of 50 in a one-off, 45-minute test, taken by most pupils as they come to the end of their primary school years.

That is the number of marks needed to secure level four in reading, the Government expectation, and represents the entire official judgement on that pupil’s ability in reading over the past four years.

If a child scored 30 marks out of 50 in last year’s tests, they would have achieved a level five in reading, which, statistically and according to the interpretation we are expected to put on these data, is the level of proficiency expected of a 14-year-old. If they scored between 11 and 17 marks, they would be at level three.

That is it. Nothing else counts in official estimations of what it means to be able to read. Our entire primary education system – at least so far as reading is concerned – hinges on the proportion of pupils achieving these expectations and pass marks, which are very closely bunched, in a one-off test on one day in May.
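Reduced to code, the official judgement amounts to nothing more than a lookup against those mark thresholds. A sketch, using the 2010 boundaries quoted in this post rather than any official algorithm:

```python
# The 2010 KS2 reading test boundaries as quoted above: 30+ marks for level
# five, 18+ for level four, 11+ for level three; 10 or fewer is reported as
# below level three.

def reading_level_2010(marks_out_of_50):
    if marks_out_of_50 >= 30:
        return "level five"       # statistically equated with a 14-year-old's proficiency
    if marks_out_of_50 >= 18:
        return "level four"       # the Government's expectation for an 11-year-old
    if marks_out_of_50 >= 11:
        return "level three"
    return "below level three"    # level two at best: the standard expected of a seven-year-old

print(reading_level_2010(17), "|", reading_level_2010(18))  # one mark apart
```

One mark separates level three from level four; eight marks separate “the reading age of a seven-year-old” from the level expected of an 11-year-old.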

I highlight the case of reading because it came up in coverage shortly before Christmas by the Today programme. It led its broadcasts one morning with claims that “thousands of boys start secondary school only able to reach the reading standards of seven-year-olds or below”.

This was based on a technically accurate interpretation of figures generated by national test data, but it led me to question why people are putting such huge weight on figures which, if you step back for a second and think about the detail of what these data mean, cannot support this interpretation.

Today had obtained figures – released every year – which showed that in 2010, 10 per cent of pupils obtained below a level three in the reading test. This means that they either scored 10 marks out of 50 – 20 per cent – or below on the tests, or did not even take them.

The logic of Today’s argument was this. Pupils scoring below level three in the reading test have scored level two at best. Level two is the performance technically expected of a seven-year-old in the tests pupils take at this age. So the 10 per cent of boys failing to achieve level three are performing at the level expected of a seven-year-old.

This finding, which suggests a serious problem – implying, I would venture, to many listeners that many boys are wasting years at school making no progress – is presented as a national scandal; and it is, at the very least, very serious for these boys.

Consider, though, more detail on how these data are generated. A child could fail to achieve a level three with 10 marks out of 50. But with another eight marks – 18 out of 50, or 36 per cent – these boys would have achieved a level four, in line with government expectations of an 11-year-old.

The difference between having the reading age of a seven-year-old, then, around which national debate centred, and that of an 11-year-old turns out to be eight marks on one 50-mark test. To put it another way, a seven-year-old who took this reading test could have scored 10 marks and be said to be performing in line with expectations for their age.

If they took a similar test four years later as an 11-year-old and scored 18 marks, they would be deemed to be doing as well as expected for an 11-year-old. Thus, four years’ progress in reading could be said to come down to improving by two marks a year on a 50-mark test.

I went into some detail in this example to illustrate the difficulties we have in the way test data are being used. Believe me, I am not trying to minimise this problem: if a large number of boys really cannot read, it is a serious national issue.

The trouble is, I don’t think the test data – and, to a certain extent, the way they are reported – are helping us understand the nature of that problem and thus to do something about it.

Consider again the interpretation of the figures around which the Today programme that morning revolved, including an interview with Michael Gove, the Education Secretary.

The lead headline on the programme’s website read “Gove: 11-year-old illiteracy ‘unacceptable’”. John Humphrys, the presenter, also used the term “illiteracy”.

But, in fact, the test data actually tell us nothing about “illiteracy”. They don’t tell us whether the number of boys quoted in the programme actually are “illiterate” – can’t read or decode text – or whether their problems are different from that.

Strictly, they tell us only that a number of boys either could not score a certain number of marks in a one-off reading comprehension test (further scrutiny of the government data shows 4 per cent were entered but did not achieve level three), or were not entered for the test because their teacher believed they would not pass (5 per cent), or simply missed the test (1 per cent).

We don’t know, then, from the test data, whether the problem with these children is: a) a genuine inability to decode text – although the fact that nearly half of them scored some marks on this test would suggest this was not the issue; b) problems with reading for comprehension (ie they can actually read the words, but they don’t really understand either what they mean or what the question is asking); or c) a failure to cope with the format of being tested.

There is, of course, another explanation: that these children scored below their “true” level of understanding through having an “off-day” or just being unlucky: there will always be measurement uncertainty and inaccuracy in a one-off test.

If we don’t know what these figures actually mean, how can we do anything to help children to genuinely improve? Is what the nation needs a greater emphasis on helping children with decoding, as is suggested through the introduction of a new phonics test, or more work on comprehension, for example? The test data give us no answer.

It also was not reported – and generally isn’t – that we do know that substantial numbers of pupils in this category of failing to reach level three have special educational needs: by my calculations from government data, seven in 10 children who failed to reach level three in English in 2010 were classed as having a special need. Nearly 10 per cent of those failing to reach level three are classed as autistic; a further seven per cent have specific learning difficulties; 10 per cent have communication needs; and a further 10 per cent have behavioural, emotional and social difficulties. None of these figures was presented in the Today programme’s reporting.

Neither, by the way, was any international context given: boys’ reading is a problem around the world, as last month’s Organisation for Economic Co-operation and Development PISA study showed. It included the following quote: “Across OECD countries, 24 per cent of boys perform below level 2 [at the bottom of six levels of the PISA reading tests], compared to 12 per cent of girls. Policy makers in many countries are already concerned about the large percentage of boys who lack basic reading skills.”

The fact that these test data – and sometimes the reporting around them – allow us only a very superficial, decontextualised understanding means that we really are letting down pupils and the education system as a whole.

We could do so much better. Not only does the accountability system which centres on pushing schools to raise these test numbers – through league tables, targets, Ofsted inspections and the rest of the centrally-created performance apparatus – encourage schools to spend months drilling pupils to achieve what amounts, if you look in detail at what the figures mean, to a few extra marks on one-off English and maths tests; we also lack the understanding, in terms of the national data these test figures generate, both to help these pupils do better (ie to work out what it is they can and cannot do) and to help the system as a whole to improve.

Today presented the problem as a hugely serious issue for the nation. But we are not taking it seriously at all if this is the level of analysis being offered.

If Sats are the height of our ambition in assessment – and there are still signs, under the new government, that this is what pupil progress will revolve around – then we really have a problem. We need to look at much more sophisticated and useful measures of children’s understanding, both from the point of view of helping the individual child improve and from the point of view of getting a much better understanding of what is really happening nationally.

The rest of this speech, delivered to a meeting held in Parliament in January to launch a joint Association of Teachers and Lecturers/National Association of Head Teachers/National Union of Teachers pamphlet on assessment and accountability to which I contributed, went on to talk about the washback effect of high-stakes, test-based accountability on teaching, with which readers of this blog will be familiar.

- Warwick Mansell

posted on June 1st, 2011