Archive for July, 2011

Friday, July 22nd, 2011

The contradiction is, to this observer, breathtaking.

Last week, the Government said this: “Too many of our public services are still run according to the maxim ‘the man in Whitehall really does know best’…The idea behind this view of the world – that a small group of Whitehall ministers and officials have a monopoly on wisdom – has propagated a lowest common denominator approach to public services…”

“People should be in the driving seat, not politicians and bureaucrats,” said the Government, in its “open public services” white paper.

On Wednesday, it announced new decisions on what is to count in school league tables which clearly embody a view that, yes indeed, ‘the man in Whitehall really does know best’: certain qualifications are to be seen as valuable – no matter what pupils, teachers and parents think of them – and certain others are not.

These ‘others’ will continue to be funded by the Government, so that state schools will receive cash to offer them. But the results they generate are not to be published at the school level, because to do so would be to encourage schools to offer them. And the Government wouldn’t want to do that.

Confused enough yet? Well, I must admit that the latest developments on league tables have even me – perhaps an obsessive chronicler of their many twists and turns – scratching my head.

Now, come back to that public services white paper. I am very sceptical about this document, having blogged about it here. However, it is useful in one sense as a reference point for a world-view being put forward by the coalition which, I thought, at least had the benefit of clarity.

An idea which is central to the paper, and much other policy which has emerged from Government in the past year, is this concept of “transparency”. The argument runs as follows.

Whitehall collects huge amounts of data, across all public services. Ministers want to release as much of it as possible.

This, the theory went, would have two benefits – or so I thought, at least until Wednesday.

First, “transparency” is a good in itself. The public have a right to know as much about the public services they fund as is possible to provide, so releasing more and more stats on the different qualities of institutions must be a good thing.

Second, one of the key aims is to promote choice and competition. In education, by providing more and more data, the idea is that people get a more and more detailed idea of the quality of each school. In doing so, they get the chance to make more effective choices. The implication is that this forces schools to have a more and more tightly-defined regard for what the “consumer” – the parent or pupil – wants, and so, it is argued, the quality of education provided must rise.

An intriguing twist on this argument is that, third, in releasing huge amounts of data in different categories, the Government effectively democratises the use of these statistics for accountability purposes. In the old days, it is claimed, schools were judged by just one or two results formulae, laid down to very tight specifications by civil servants and ministers, meaning that they worried most about performing to goals which had been set for them not by the public, but by bureaucrats.

Now, with schools able to be judged in any number of ways, and with the user of the service choosing what matters most to them, the entire process has been devolved, with accountability resting where it should: between institution and user, rather than between institution and policy-makers.

Much of this, I think, is actually very contentious and I hope I have questioned much of the above at some point or another. But it does at least have the virtue of being reasonably internally consistent; indeed, some would say that it is too simple, and too ideological. But, as I say, it is a view.

So, the Department for Education press release yesterday began: “The Department for Education today announced that only the highest quality qualifications will be included in new, transparent school league tables.”

GCSEs and iGCSEs would be included, but other qualifications would have to pass some kind of quality check in order to be released for publication, even though these latter qualifications would continue to be taken in schools and colleges, and funded by the state.

Um, so that would seem to violate the first principle that I thought the coalition’s reforms in this area were based on: complete transparency. If the government had data on something going on within a school, I thought the idea was that it would release it to the public.

Yet here we have a government which says it is committed to transparency seemingly, mind-blowingly perhaps given what I thought its philosophy was, proposing that pupils take a set of qualifications whose results will then be… kept secret.

Nick Gibb, the schools minister, is quoted as saying: “Parents want more information so they can judge schools’ performance. The changes we have made mean that parents will have a complete picture of their local schools so they can choose the right school for their child.”

Eh? No they won’t have a complete picture. It’s only “complete” if you believe that the non-GCSE courses which are no longer featuring in the rankings, but which pupils will continue to take, are not any part at all of what counts in a school. That’s a value judgement made by a minister, rather than coming from decisions at the school or family level. (I also wonder exactly what evidence there is for the at-face-value plausible assertion that “parents want more information so they can judge schools’ performance”, but that’s another matter…)

And of course, just as strikingly, this move violates the third seeming principle: that publishing more data allows the public, rather than the state, to judge what it values within what an institution provides. Yet here, the state is laying down exactly which qualifications are to be seen as high quality, and which are not. It is the officials and the minister – Nick Gibb is the one quoted in this release – who are acting as if they have the “monopoly on wisdom” here. For if pupils, advised by their parents, choose to work for a qualification which does not feature in the league tables, they will demonstrate that they believe it has some value. Mr Gibb and his advisers appear to be keen on telling them that they are wrong. It is a very un-free-market approach, and very un-Tory.

So, as I say, my head is spinning with all of this. I can’t quite understand why a policy has come about which is so at odds with what I thought was the over-arching philosophy. However, I thought I would venture a couple of possible reasons.

The first is reasonably simple: the juxtaposition of this policy and the transparency/democratisation-of-accountability philosophy might not make much sense, but both of them potentially play well with the media, so, in the policy-maker’s mind, why not go for it? The “transparency” argument above will be accepted by many people, while the bet that this would provide headlines suggesting that ministers are getting tough with “dodgy” vocational qualifications also appears to have paid off in some newspapers. It’s a win-win, and while this might make for contradictory policy-making, who will notice?

I think that’s only part of the answer, though. The second explanation is a clear implication of the press release: the Government is simply going along with the finding – in the report on vocational qualifications by Alison Wolf which lays the groundwork for these changes – that pupils have been incentivised to go for certain vocational courses not because of the worth of the course to the individual, but because of their high rating in league tables for the school.

That’s an argument I’ve been making since at least the time my book came out, of course. And yes, this is indeed a side-effect of the current league tables. (The press release amusingly says “the Wolf Report demonstrated that the current performance table system creates perverse incentives,” as if this had been in doubt beforehand, or as if any performance table system would not create some kind of side-effect).

The TES rightly points out, on its front page today, that this is the final confirmation that the contribution of non-GCSEs to headline “GCSE” measures will be capped, which must be a correct decision, given the way “GCSE” league table findings are interpreted by the public and given the perverse incentives which have existed up to now.

But otherwise the remedy to this problem is bizarre. This particular perverse incentive was created not because of the mere existence of vocational qualifications in any league table ranking (though all league tables will create perverse incentives), but because some of them were – seemingly, to this observer – so over-weighted in the central indicators that there was a huge incentive for schools to push pupils towards them, with the need of the school to raise its scores being at least a large part of that calculation in many cases.
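To make that weighting mechanism concrete, here is a minimal sketch in Python of how an equivalence rule can move a headline indicator. The cohort and the four-GCSE equivalence are invented for illustration – multi-GCSE equivalences of this general kind were what the Wolf report examined – so this is not DfE methodology, just the arithmetic of the incentive:

```python
# Illustrative only: hypothetical cohort and weighting, not DfE methodology.
# Shows how counting one vocational award as several GCSE "equivalents"
# can lift a school's headline 5+ A*-C rate without any extra GCSE passes.

def headline_pass_rate(pupils, equivalence):
    """Share of pupils reaching 5+ A*-C GCSE equivalents."""
    passing = 0
    for gcse_passes, has_vocational_award in pupils:
        equivalents = gcse_passes + (equivalence if has_vocational_award else 0)
        if equivalents >= 5:
            passing += 1
    return passing / len(pupils)

# Hypothetical cohort: (GCSE A*-C passes, takes the vocational award?)
cohort = [(2, True), (3, True), (5, False), (0, True), (6, False)]

print(headline_pass_rate(cohort, equivalence=4))  # 0.8: award counts as four GCSEs
print(headline_pass_rate(cohort, equivalence=1))  # 0.4: award counts as one GCSE
```

On these invented numbers, the school’s headline figure doubles without a single extra GCSE pass being earned: that gap is the incentive to steer pupils towards the heavily-weighted course.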

If the Government changed this so the results for particular individual non-GCSEs were simply published separately alongside each GCSE in the tables, this particular perverse incentive, I think, would have gone. Schools would still have to think about success rates for every type of individual qualification they entered – which can itself be a perverse incentive, in that I don’t think a pupil wanting to take a course should be pushed away from it just because they are unlikely to get a C grade, though I guess some teachers would dispute this – but at least schools would be relatively free to opt for courses they, and the parent and pupil, potentially valued. A course could still have its results published, but if parents and pupils did not value it, the “market” would kill it off in a way that might not have happened under the old system, when schools were incentivised to push pupils towards particular courses with high league table weighting.

As it is, the new league tables will not neutralise the incentives on schools to push pupils towards particular qualifications because of the benefit to the school, rather than to the pupil. They will simply change the type of qualifications which might be favoured, based on the “wisdom of Whitehall ministers and policy-makers” as to which type of courses should be favoured. To put it another way: ministers don’t seem to like qualifications assessed entirely through coursework, and I would agree that it is tough for these courses to co-exist and have credibility with a high-stakes accountability system in which teachers are being held to account for the results.

But simply removing such courses from high-stakes accountability, in the sense that the results of these qualifications are not published themselves, does not remove them from the effects of league tables. Ministers are incentivising schools to move away from them. Is this the best move for the child? Again, schools are not able to take that decision from a neutral perspective, because of the mechanism of high-stakes accountability.

To put it another way, the press release says: “Teachers will still be able to use their professional judgement to offer the qualifications which they believe are right for their pupils.” But this will, still, clearly be influenced by league table considerations, as, mind-blowingly, even the DfE knows, since it also says in the press release that its league table changes will “ensure that schools focus on valued qualifications”. (Sorry, I made the mistake of looking at the press release again; it’s not good for my head).

So, ahem, clearly I still think there are fundamental problems with league tables and results pressures at lots of levels. But I’m especially surprised the government did not come up with a solution which at least is a better fit with its own logic. Not to have done so either smacks of the thinking behind the rankings getting so complex that everyone gets confused, or of a belief that the incentives within league tables can be harnessed for the greater good, even despite the clear inconsistency. Some will also claim that there is a vindictiveness in ministers coming out against non-GCSE exams and those which are assessed by the teacher, although I am not sure about this verdict myself: some of the arguments within the Wolf report about needing to look properly at the quality of courses towards which “non-academic” pupils are being pushed are powerful. But then again, if these are not good courses, why should the Government continue to allow them to be funded in state schools?

I wonder if this is not also another example of that mad pendulum swinging in education policy: Labour worried that league table pressures would push schools away from vocational courses unless they were given an incentive not to do so – so it over-incentivised them – and the Tories respond by using one of the easiest levers they have to pull – how schools are held to account – to wipe many of them off the official map of what counts.

The Government’s move may have demonstrated something useful, though. Sometimes, to hear supporters of league tables talk, publishing data is both a largely “neutral” act of transparency, and almost inevitable. To open up the statistics is simply a matter of letting some sunlight into previously obscure areas of school practice, one would think sometimes from listening to the advocates of this movement.

In reality, the choices the Government makes as to what is measured, how, and what data is released, are hugely important. It remains, in this sense, a very centralised system, which I suspect is why civil servants and ministers like it so much. It clearly drives how schools act. But in this sense, it is not a “neutral” act, to be judged simply according to whether or not you like the idea of releasing more data. Politicians should be held to account for the effects of their moves on data, not just the existence of them. To the extent that this move will stop pupils being pushed towards non-GCSE courses because of the high-equivalence factor, the politicians should be praised. But believe me, the problems and injustices are not going to go away, and the inconsistencies here are now glaring.

One final point. The press release also says: “The 2011 school league tables will highlight the performance and progress of disadvantaged groups compared with other pupils. This will create a powerful incentive to narrow the gap in achievement between richer and poorer pupils.”

I know what the thinking is behind this, and on the surface it seems commendable. But I can’t help cringing when I read it. There’s a sense of a teacher reading this and thinking: “Oh yeah, helping disadvantaged kids do well. I’d never thought of that. I just wasn’t bothered before. Now, thanks to your wisdom, Mr Gibb, in putting another column on the spreadsheet by which I’m judged, I realise the error of my ways. I’ll try harder now. Thanks.”

I know one of the most persistent debates within education is about many teachers not having high enough expectations of pupils from tougher backgrounds. But it strikes me that if the main way of tackling it is for ever-closer monitoring through the statistics generated at the end of this process, we are in danger of missing an awful lot of tricks.

If the Government isn’t already trying to promote this – through the way it trains teachers and develops them in the classroom, the messages it sends them in its rhetoric, and the support it provides to improve the quality of the educational experience for all pupils, particularly those from disadvantaged families – I don’t know what it is doing. And if it has done all that, and then effectively argues that teachers still need an “incentive” at the end of the process, you have to ask what has gone wrong along the way.

- There are other things to comment on in the league table announcements – including the fact that pupils taking the EBacc appear now to have to take seven GCSEs, raising question marks over how much time they will have to study much else, and the contentious, for me, move from contextual value added to value added and unadjusted progress measures as the main “fair” ways of judging schools – but I seem to have run out of space and time today… keeping up with developments in the rankings is a full-time job, it seems…

- Warwick Mansell

6 Comments
posted on July 22nd, 2011

Tuesday, July 19th, 2011

Well, I said at the end of the last blog that I’d be writing something imminently on the relationship between the Bew assessment review and the government’s ongoing national curriculum review. Here, slightly earlier than planned, is what I had in mind.

This week’s Government response to the Bew review into primary assessment could be redundant within just over two years.

That is the implication of comments made by a leading figure within the test regulator Ofqual at a conference on the national curriculum I attended on Friday.

Stephen Anwyll, Ofqual’s head of 3-14 assessment, said the long-term future of testing would be “up in the air” until after the outcome of the current national curriculum review was known.

Assessment arrangements in primary and secondary schools would have to be “completely revised” if the review led to a fundamental rethink of what schools teach.

The standards pupils achieved in any national assessments created as a result of the national curriculum review might also not be comparable with current performance, he said, since measurement would need to be “recalibrated” as assessments changed.

Mr Anwyll also suggested there was a contradiction between the Government’s suggestion, in its remit for the curriculum investigation, that it should not cover assessment and the detail of what it was being asked to look at. “You cannot separate the curriculum from assessment,” he said.

English, maths, science and physical education aspects of the curriculum for 5- to 16-year-olds are due to be revised for first teaching from September 2013 following the curriculum review, which is expected to produce first recommendations by early next year.

Speaking at a Keele University Centre for Successful Schools conference last Friday, Mr Anwyll talked about Ofqual’s detailed work to be carried out in response to the Bew review.

He then added: “Sitting beyond all of this, in the slightly longer term…all of this is up in the air depending on the outcome of the national curriculum review.

“If we are talking about, actually, a new programme of study, in the first instance for English, maths and science, which we are expecting to see some examples of this year, that could change the entire picture.

“If you reform standards as part of the national curriculum review, it’s ground zero again; you calibrate the standards from there – you cannot start comparing to previous standards.”

He added: “National curriculum assessments are excluded from the remit of the national curriculum review.

“But if you look at what’s included in the remit, it includes whether the national curriculum should be set out on a year-by-year basis, what should replace existing attainment targets and level descriptors to define better children’s standards of attainment, and what’s needed to provide expectations for progression to support the least able and stretch the most able.”

“All of these are absolutely fundamental to assessment, so you cannot separate curriculum from assessment.”

That comment appears to echo a statement by Sir Jim Rose, leader of England’s last curriculum inquiry, carried out in the dying days of the last Labour government. He was barred from considering assessment but said this was the “elephant in the room” when he visited primary schools.

Mr Anwyll added: “Much of what we do currently will have to be completely revised if we get a new national curriculum, new standards defined, and new ways of measuring them defined.”

That’s the newsy bit; what follows is comment from me:

This notion of a contradiction – between a national curriculum review which is supposed not to be looking at assessment matters, and the practical impossibility of a review of this type not having serious implications for assessment – was underlined this week in the Government’s response to the Bew report.

The remit for the national curriculum review says: “The review itself will not provide advice on how statutory testing and assessment arrangements should operate”.

Yet this week’s Government response to Bew says: “The national curriculum review will consider the suggestion from Lord Bew and the panel for statutory assessment to be divided into two parts….”*

It also says: “The National Curriculum Review will consider how we report statutory assessment in the long term.”

Hmm.

The full sentence of that quote above about statutory assessment being ‘divided into two parts’ is: “The National Curriculum Review will consider the suggestion from Lord Bew and the panel for statutory assessment to be divided into two parts in the future, with a ‘core’ of essential knowledge that pupils should have learnt by the end of Key Stage 2.”

This looks to be a suggestion that some “basic” literacy and numeracy skills tests be introduced at KS2. It is building on a somewhat mysterious idea flagged up near the end of the Bew report, which I blogged about here. One to watch, I think, and not only by people who wonder at the polarising language of “knowledge that pupils should have learnt”…

*I was reminded of this nugget of info via Helen Ward of the TES on twitter.

- Warwick Mansell

No Comments
posted on July 19th, 2011

Monday, July 18th, 2011

This is just a quick reflection on union reaction to the Government’s proposals on the future of assessment at Key Stage 2.

Ministers published today their response to last month’s final report by the Bew inquiry into this subject, the review which itself was triggered by last year’s Sats boycott by the National Association of Head Teachers and National Union of Teachers.

The unions’ reaction is interesting: four different associations produced, arguably, three or four different positions in response.

This could be viewed as surprising, given that, for all the changes put forward in Bew, the fundamentals of the high-stakes testing regime remain in place, despite widespread concerns within the profession. Or it may simply reflect a beneath-the-surface belief that, whatever the problems with current structures, essentially the basics of the system are in the end unchallengeable, and therefore the argument must be confined to the detail as to how it works.

In terms of their reaction to the Government response to Bew, the heads’ associations were more upbeat. Perhaps unsurprisingly, the National Association of Head Teachers, which called off the possibility of a repeat of last year’s boycott in 2011 in return for the Bew inquiry, and which was allowed to recommend head teachers who would sit on the Bew committee, was broadly positive about its outcome.

Its press release was headlined: “Bew recommendations are a significant step forward towards fairer accountability system, say school leaders.”

But the NAHT said the Bew recommendations – every one of which has been accepted by the Government (always an interesting development for any inquiry which is billed as independent from ministers, I feel) – were only a “first, positive step on a long journey towards a system which reflects the achievements of all pupils and the contribution of all schools”.

Longer-term goals included a far greater role for teacher assessment and more trust in the profession, and the NAHT said it would be on the look-out for ministers breaking with the spirit of the Bew recommendations.

The Association of School and College Leaders was also accentuating the positive, headlining its release: “KS2 assessment moving in the right direction.” I will come back to this.

Both the National Union of Teachers and the Association of Teachers and Lecturers were less optimistic.

The NUT argued: “The positive steps in this review will be undermined by keeping in place school performance tables, despite the fact that the majority of those who gave evidence called for their abolition.

“While league tables exist, teaching to the test and a narrowing of the curriculum will remain…The Review and the Government should have been bolder.”

The ATL said: “There is some good news in the government’s changes to key stage 2 testing, but so much more could be achieved if the government was not insisting on remaining judge, jury and executioner of schools by setting targets, closing schools, and forcing through its naïve free market policies on academies.”

I haven’t received a press release from the National Association of Schoolmasters Union of Women Teachers, but we know that that union has long favoured tests over teacher assessment, amid concerns about the effects of TA on teachers’ workloads.

For me, having looked at – and written about – the changes proposed by Bew (blog here), this feels like a very muted end to what has been years of pressure building on ministers over testing: the NAHT itself conducted a review into the architecture and effects of the current system, to which I contributed, and which dates back to 2007. Part of that pressure was exerted, amazingly perhaps as it appears now, by Michael Gove when he seemed to accept, in 2009, that test-driven teaching can be bad for children’s education. It also built through the testimony of the unions, subject associations and reports from organisations such as the Children, Schools and Families select committee, the Cambridge Primary Review and the Children’s Society’s Good Childhood inquiry.

One could look at the positive reaction with which many teachers are likely to greet the move, recommended by Bew and accepted by ministers, for the current writing composition test in KS2 English to be replaced by teacher assessment, and reach a different verdict from the quick one I’ve offered above, of course.

Or, for critics of the high-stakes regime, there is the fact that, since 2008, the following Sats tests have bitten the dust: English, maths and science at KS3; science at KS2; and creative writing at KS2. This might be considered a good outcome of all that pressure.

But, on the negative side, Bew put forward, shockingly, I think, the unbalanced assertion that “strong evidence shows that external school-level [presumably statistics-based] accountability is important in driving up standards”. And the essentials of our system – that test and exam results will remain the main mechanism by which both secondary and primary schools are held to account, with high stakes including closure to follow for “underperformers” – remain unchanged.

The underlying argument must be that this high-stakes system has been good for English education, and that it is a key to continuing progress in the future. If this were not the underlying assumption behind Bew, we would not be proceeding on the current basis, for it provides no fundamental attempt to re-engineer assessment and accountability so that the system gets the accountability it needs without the knock-on washback effects on teaching and learning.

As ever, the basic architecture of test- and exam-based accountability seems to be the unalterable fact of education in England, to which everything – including, I’m afraid, a fair-minded and rigorous consideration of its overall effects on children’s education – must come second. More than 15 years after the introduction of national testing in England,  there has still been no detailed Government inquiry into the nature, extent and effects of test-driven teaching in this country: how many schools go in for it, the detail of how children’s learning is affected and what pupils alongside teachers think about it. This is astonishing, really, if you believe that the quality of the child’s educational experience is to be looked after above all else.

Just finally, I want to return to ASCL’s position, which I think is the most curious.

Brian Lightman, ASCL general secretary, is quoted in its press release as saying: “There must be a robust but fair process of assessment for pupils as they move from primary to secondary school. This is important not only for pupils and their parents, but also so that their new schools have accurate and reliable information about their level of progress.”

I find this statement, which reflects what has been ASCL policy for a while, strange because of the contrast with the somewhat ambivalent relationship secondary schools have with KS2 assessment data, as documented in the Bew report (and elsewhere).

The final Bew report says: “We have heard widespread concern that secondary schools make limited use of the information they receive about their new intake. Many secondary school respondents have expressed concern that national curriculum test results or primary schools’ teacher assessment are not always a suitable proxy for the attainment of pupils on entry to Year 7.”

If many secondary schools don’t trust pupils’ Sats results (or test-influenced TA judgements), why does ASCL want them retained as “robust but fair” measures?

I’ve not put this to ASCL, but I believe the answer is that the union, while it doesn’t particularly trust Sats results as measures of pupils’ underlying understanding, doesn’t want them replaced with teacher assessment because secondary heads worry that primary schools would inflate TA judgements. This would leave secondaries’ results looking less good, because it would mean pupils would appear to be making less progress at secondary school.

So while secondary heads might have reservations about the value of the data provided by Sats, the implications in terms of the accountability system for them mean they back them. As usual, the demands of the accountability system, then, seem to trump other concerns.

This may be a scandalous explanation for ASCL’s position on this issue, but I am struggling to think of another one.

All of which leaves me slightly saddened. There is still an awful lot of evidence that this system is not serving at least a large proportion of children’s needs well. It is a shame that the unions have not seemed able, in the end, to come together to continue pressing home that point.

- Is this really the end of the story for assessment at KS2, though? The current national curriculum review may throw things up in the air again. I expect to write more about that in the next few days.

PS: It is interesting to play “spot the difference” between the stated purposes of national assessment, as laid down by the Bew report, and the previous attempt at this, by the Labour government’s “Expert Group” on assessment, which took in the fall-out from the 2008 Sats marking crisis and reported to the former schools secretary Ed Balls in 2009.

Bew lays down three main purposes of statutory end of Key Stage 2 assessment data as follows:

a Holding schools accountable for the attainment and progress made by their pupils and groups of pupils.

b Informing parents and secondary schools about the performance of individual pupils.

c Enabling benchmarking between schools, as well as monitoring performance locally and nationally.

The “Expert Group” report defined the purposes of “assessment” up to the end of Key Stage 3. It came up with four:

-To optimise the effectiveness of pupils’ learning and teachers’ teaching.

-To hold individual schools accountable for their performance.

-To provide parents with information about their child’s progress.

-To provide reliable information about standards over time.

As you can see, Bew’s top two purposes are extremely similar to the 2008 report’s purposes two and three. The 2008 report’s purpose four is a subset of Bew’s purpose three. The largest difference between the two is that the first purpose mentioned in the 2008 report, which in my view is correctly placed at the top of the list, does not feature in Bew’s list. Otherwise little, it seems, changes.

- Warwick Mansell

1 Comment
posted on July 18th, 2011

Sunday, July 3rd, 2011

I should begin this blog post with a note of slight regret. It gives me no pleasure to be writing something which is critical of the Bew report, especially given the courtesy with which Lord Bew treated me in giving evidence to the review. He invited me to do so, and even wrote me a handwritten note to thank me afterwards. The review’s interim report, published in April, was, I thought, a largely impressive synthesis of evidence on this subject which gave me hope that, whatever the outcome and whatever the constraints of the remit, the issues would be given a thorough and fair weighing in the final report.

Yet, I am afraid, despite some impressive passages, the report really does not do justice to this, I think, incredibly important subject.

I say this mainly for three reasons.

First, the report fails to follow through on what is said, at least in the foreword, to be the first priority for the assessment and accountability system: ensuring that such a system supports children’s learning. Second, it misrepresents the evidential position on the effects of test-based accountability in a fundamental way. And third: it does not address in any meaningful sense a central criticism of test-based accountability: that test results are being used for too many purposes and that key purposes can be at odds with one another (my italics, since this was the bit that was not meaningfully considered).

To deal with the first problem, Lord Bew says in the report’s foreword:

“We would like to be quite clear that throughout this process we have always focused on how best to support the learning of each individual child.”

If this had been the overall goal of the review, I would say “fantastic”. The trouble is that, having been set up as an aim in the foreword, this approach is all but absent from the report itself, where the quality of the learning experience resulting from accountability – what, if anything, is happening in lessons as a result of test-driven accountability? – gets only glancing consideration.

This becomes clearer when we look at the report’s consideration of evidence.

The report says: “Strong evidence shows that external school-level accountability is important in driving up standards and pupils’ attainment and progress. The OECD has concluded that a ‘high stakes’ accountability system can raise pupil achievement in general and not just in those areas under scrutiny.”

Well, I wrote in detail here about the OECD evidence on which Bew drew for this statement.

I do not think anyone reading that report in full could believe that it provides a ringing endorsement of an “English”-style accountability system. Consider, as I mentioned in that blog, the fact that that OECD report says: “Across school systems, there is no measurable relationship between [the] variable uses of assessment for accountability purposes and the performance of school systems.”

Moreover, although Bew says “the OECD has concluded that a ‘high stakes’ accountability system can raise pupil achievement”, with “high stakes” in quotation marks, in fact the phrase “high stakes” only occurs once in the main text of the 308-page OECD report which Bew references here, and its use does not back up the claim made here. (“High stakes” in the one instance referenced in this report refers to any qualification which is high stakes for a pupil, by which criterion the A-levels I took in the 1980s – which were low stakes for my school – would count but today’s Sats would not).

As I wrote in an article for the TES based on research for the NAHT, in fact there are many education systems which are not doing a demonstrably worse job than England and which do not have “high-stakes” accountability of the English kind.

If Bew’s claim that this type of accountability “is important in driving up standards and pupils’ attainment and progress” is to be understood as meaning that it improves education in a more general sense than simply improving test scores – which must at least be considered if the quality of pupils’ learning is really what matters – then the report needs to consider more evidence.

Yet this section of the report, entitled “the impact of school accountability”, includes no studies raising concerns on the issue of test-driven schooling. It highlights only research which supports it.

This section then simply ends: “We believe the evidence that external school-level accountability drives up pupils’ attainment and progress is compelling.”

This is an absolute travesty of the evidential position. I would say that, given that I wrote my book on this subject from 2005 to 2007 seeking to put together all the evidence I could find on the effects of this system. Negative effects were not hard to come across: detailed concerns about the side-effects were coming to me naturally virtually every week around that time in my work at the Times Educational Supplement. To repeat, none of this evidence gets a mention in the section of the report where Bew is deciding whether or not high-stakes accountability is a good thing.

That is a shocking indictment of this final report. For all the evidence commented on in the interim report, this omission undermines any claim that the subject has been considered in a truly open-minded way.

If the evidence had been considered, weighed and a conclusion reached that the claimed advantages of hyper-accountability outweighed the claimed negatives (taken seriously and considered in detail); or if a conclusion had been reached that the current system, though imperfect, should be retained because changing it in a fundamental way would present too many difficulties, well, at least that would have been more honest. To try to claim that the evidence points entirely in this single direction is simply wrong.

Other inquiries to have raised deep concerns about test-driven schooling in recent years have been the Children, Schools and Families assessment investigation of 2007-8, plus its subsequent probe into the national curriculum; the Children’s Society’s Good Childhood Inquiry; and the exhaustive Cambridge Primary Review. Sir Jim Rose, in conducting his own national curriculum inquiry for Labour which was barred from considering assessment, described it as the “elephant in the room”, in terms of the impact on the curriculum.

Consider some of the claims made in evidence to these various reviews.

The Mathematical Association told the select committee inquiry: “Coaching for the test, now occupying inflated teaching time and effort in almost all schools for which we have information at each Key Stage, is not constructive: short term ‘teaching how to’ is no substitute for long-term teaching of understanding and relationship within and beyond mathematics as part of a broad and balanced curriculum.”

The Cambridge Primary Review reported one witness, citing her experience as an English teacher, primary head and English examiner, as condemning “the ‘abject state of affairs’” where reading for pleasure in schools “has disappeared under the pressure to pass tests”.

The Independent Schools Council told the select committee’s curriculum inquiry: “National curriculum assessment should not entail excessive testing. Universally, a focus on testing was found to narrow children’s learning, teachers’ autonomy and children’s engagement in learning.”

Ofsted also told the select committee that “In some schools an emphasis on tests in English, mathematics and science limits the range of work in these subjects in particular year groups.” An Ofsted report on primary geography from January 2008 found that “pupils in many schools study little geography until the statutory tests are finished”, while an Ofsted report on music said “A major concern was the amount of time given to music. There were examples of music ceasing during Year 6 to provide more time for English and mathematics.”

The OECD itself said, in the education section of its report on the UK in March this year: “Transparent and accurate benchmarking procedures are crucial for measuring student and school performance, but “high–stake” tests can produce perverse incentives. The extensive reliance on National Curriculum Tests and General Certificate of Secondary Education (GCSE) scores for evaluating the performance of students, schools and the school system raises several concerns. Evidence suggests that improvement in exam grades is out of line with independent indicators of performance, suggesting grade inflation could be a significant factor. Furthermore, the focus on test scores incentivises “teaching to tests” and strategic behaviour and could lead to negligence of non-cognitive skill formation”.

Either Bew has, then, defined “attainment and progress” in such a narrow sense – ie it means “there is compelling evidence that test-driven accountability drives up test scores” – that its claim to be interested in the learning of each child more generally cannot bear scrutiny (since it is only interested in the evidence of test scores).

Or improving “attainment and progress” is meant to stand for the quality of education as a whole improving as a result of “high-stakes” test-based accountability, in which case Bew has simply chosen to ignore that section of the research on this subject which conflicts with the way the review was framed by the government.

The report does, then, move on to “concerns over the school accountability system”, including “teaching to the test”. But it offers no detail of what the evidence says as to what this might mean for the pupil. The only substantial concern acknowledged here is the unfairness of the way results indicators are used for schools, which it says its recommendations will go on to tackle. This is an important argument, of course, but it is not the same as the claim, widely made, that the system of test-based accountability damages the learning experience of at least a proportion of pupils.

The only acknowledgement of this claim here is when the report says that many heads feel they “‘need’ to concentrate much of Year 6 teaching on preparation for National Curriculum Tests in order to prevent results dropping”. Bew then acknowledges that “the accountability system to date may appear to have encouraged this behaviour [my incredulous italics at the weakness of ‘may’, when heads face losing their jobs if results fall]”.

The report reacts by simply arguing that this need not happen: schools can get good results without narrowing the curriculum. That is exactly the conclusion of the last major report to look at this subject: the 2008 “expert group” report on assessment for Ed Balls as schools secretary. That report suggested running a campaign to persuade teachers not to teach to the test, since there was simply no need.

Although teachers have argued with me that a good professional does not need to teach to the test, I’m afraid I think of this, when I read it in official reports, as the ostrich, or head-in-the-sand, position. It is unscientific, I believe: the fact that some teachers and schools buck the trend does not negate the existence of the trend. The National Strategies have, in the past, encouraged teaching to the test, so presumably they thought there was some value in it for schools, in terms of improving results. I suspect local authorities have also promoted a great focus on the content of the tests in schools where the data just has to improve. Overall, the incentives of the accountability system certainly push at least a proportion of schools towards test-driven teaching and thus, if one truly wanted to change this, it would be a good idea to look at changing the way accountability works, rather than effectively simply telling teachers not to follow what for many of them will be its logic.

Then the report closes down the debate, saying simply: “Given the importance of external school-level accountability, we believe publishing data and being transparent about school performance is the right approach.”

In other words, because the review team had already decided that the evidence of the beneficial effects of external accountability was “compelling” – ie without presenting any research on negative impacts – that was the end of the matter. There was no consideration of the actual impact on children’s learning during test preparation, and the nature of it.

Incidentally, because the review team believes that “high-stakes” accountability – ie making results high stakes for schools – works, it must then also believe that assessment should drive what goes on in schools, since the philosophy must be that making assessment results “high-stakes” for schools forces them to improve the quality of education they provide.  

The third problem of the report is related to this, and I don’t want to use too much space going into it in detail here. But in essence it runs as follows. Bew really ducks another criticism of test-based accountability: that test results are used for too many purposes, and that because of this, testing as currently constituted serves many of these purposes less than well.

I’ve put the second bit in italics, because Bew really doesn’t consider this implication. Essentially, Bew accepts the widespread claim that assessment data are put to very many purposes, but reacts to this mainly by listing the “principal” purposes to which they are already put, and then saying other uses should be considered as “secondary”.

It is, I suppose, at least an attempt to consider this issue. But the problem is that the purposes suggested as central by Bew include both that data should be used to hold schools to account, and to provide good information on the progress being made by individual pupils, for the benefit of those pupils and their parents. Bew’s claim, in the foreword, that test-based accountability should also support children’s learning should be borne in mind here too, for that must be another guiding principle if taken at face value.

The problem with the report is that, arguably, the argument at the heart of this debate is that the use of data to provide information on a school’s – and on teachers’ – performance can conflict with its use both to support pupils’ learning and to provide the best possible information on the quality of that learning.

This is a big part of what the many people who, Bew acknowledges, submitted evidence to the review mean when they say that the problem is not the tests, it is the league tables which are constructed on the back of them. Because teachers are worried about their school’s results, they take actions which, while right in terms of boosting results, may not be supporting the best learning experience for the child, or their long-term educational interests. And the very act of teachers directing so much attention at the tests and results indicators may also, paradoxically perhaps, make them less good measures of underlying education quality, an argument implicitly acknowledged in the report in a section where it says many secondary teachers do not trust KS2 Sats results because of the extent to which pupils have been prepared for the tests.

In other words, the purposes – and even these “principal” purposes – are in conflict. A report which took seriously the washback effects on learning, from the child’s point of view, of the accountability system, would look much more closely at each of these aims to try to ensure that the requirements of accountability do not conflict with the aim of providing the best possible education experience for pupils.

Some alternative proposals, not backed by Bew, have tried to look at re-engineering aspects of the system to stop some of the purposes conflicting in ways which look either harmful for pupils, or which give us less good data than we might want.

For example, the suggestion put forward by many that national education standards could better be monitored through a system of assessing a sample of pupils rather than through testing every child comes because the purposes to which the current testing system is put are felt to be in conflict. A sampling system, with a relatively small number of pupils being assessed and each on differing parts of the curriculum, would allow information to be collected, potentially, across a much wider and deeper spread of aspects of the curriculum than is possible through a system where all pupils must take every test produced. And its information on whether standards were improving or falling would be more robust because, as the results would be “low-stakes” for schools, test questions could be retained from year to year to allow direct comparisons of pupil performance to be made.
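For what it is worth, here is a minimal sketch of that sampling principle – sometimes called “matrix sampling” – with invented domain names and numbers; it illustrates the idea of spreading coverage across a sample, not any specific proposed design:

```python
# Illustrative sketch of sample-based ("matrix") assessment, with invented
# curriculum domain names: each sampled pupil sits questions from only a
# few domains, so the sample as a whole can cover far more of the
# curriculum than a single fixed paper taken by every pupil could.

import random

DOMAINS = ["number", "algebra", "geometry", "statistics",
           "measurement", "reasoning", "problem-solving", "data-handling"]

def allocate_booklets(n_pupils, domains, per_booklet=3, seed=1):
    """Give each sampled pupil a randomly drawn booklet of a few domains."""
    rng = random.Random(seed)
    return [rng.sample(domains, per_booklet) for _ in range(n_pupils)]

booklets = allocate_booklets(n_pupils=60, domains=DOMAINS)
covered = {domain for booklet in booklets for domain in booklet}

print(f"Domains covered across the sample: {len(covered)} of {len(DOMAINS)}")
# Because results would be low-stakes for schools, questions could also be
# retained from year to year, making trends over time easier to read.
```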

These kinds of improvements in the quality of information provided are not possible in the current system because other purposes to which current national test data is put – to provide information on individual schools and on all pupils’ performance, meaning that every pupil must be tested, and papers must change from year to year to guard against schools “cheating” – make them unfeasible.

A more serious look at this subject would also have considered in detail the problems of seeking simultaneously to use test results as “objective” measures of pupil performance; to support learning; and also to hold schools to account. In 2006, a proposal put forward by Cambridge Assessment and the Institute for Public Policy Research acknowledged the problem that the purposes were in conflict: the need for schools to generate good results could lead to test-driven teaching and a narrowed curriculum, which was not an ideal form of learning. It therefore proposed a change whereby teacher assessment would become the main judgement on both pupils’ and schools’ performance, with children in each school then assessed through a “testlet”, measuring for each child just a small area of the curriculum. The testlet results would be used as an assurance that the accountability function now placed on teacher assessment was not leading schools to inflate their results. In other words, it retained accountability but, in trying to change the relationship with tests in a small number of subjects, attempted to stop it conflicting with the goal of supporting good learning. This idea was not considered in detail by the report.*
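As a rough illustration of the moderation role the testlets would play – my own sketch, with invented levels and threshold, not the Cambridge Assessment/IPPR specification – the check might look something like this:

```python
# My sketch of a testlet moderation check, not the Cambridge Assessment/
# IPPR design itself: the threshold and data below are invented.

def flag_possible_inflation(ta_levels, testlet_levels, tolerance=0.3):
    """Flag a school whose average teacher-assessment (TA) level exceeds
    its average testlet level by more than the tolerance."""
    avg_ta = sum(ta_levels) / len(ta_levels)
    avg_testlet = sum(testlet_levels) / len(testlet_levels)
    return (avg_ta - avg_testlet) > tolerance

# Hypothetical school: TA judgements vs levels from the sampled testlets.
teacher_assessment = [4, 4, 5, 4, 5, 4, 4, 5]
testlet_results = [4, 3, 4, 4, 4, 3, 4, 4]

print(flag_possible_inflation(teacher_assessment, testlet_results))
# True here: TA runs roughly 0.6 of a level above the testlet evidence.
```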

Another alternative, mentioned as my favourite in my book, would be to make inspection judgements the central focus of school-by-school accountability (with inspections offering a rounded look at the quality of education provided, to guard against curriculum narrowing), and to run sample tests to help provide national education quality information.

Instead of trying to look at the relationship between the purposes, Bew has simply left the mechanics of the system in place, in that assessment data is still to be used for all the main purposes to which it is now put: holding schools to account, producing data on individual pupils’ performance for the benefit of them and their parents, and generating national and regional achievement data.

The report says that through its proposals “we believe we can address the imbalances and perverse incentives in the school accountability system”.

Because the review has not addressed the issue of the conflict of purposes, this idea of countering perverse incentives is, I think, a forlorn hope. Its proposals represent no significant change to the system’s fundamentals, but rather a restating of the basis of the system – which the report must implicitly believe, in its essentials, to be a good thing – and then an attempt to manage the detail.

OK, so now, finally, to turn to the concrete stuff in terms of those detailed changes recommended by the report, some of which, I think, are important.

- The report proposes moving to a system of publishing schools’ results averaged over a three-year period, to address concerns that judging institutions on single years is unfair, given the way pupil cohorts can change. Small schools, where the introduction of a few high- or low-achieving pupils can have a proportionally very large effect on results from year to year, are particularly hard hit by the current system, and their concerns would seem to have influenced this change. However, three-year averages are not recommended to replace single-year statistics, but to sit alongside them in league tables. A key consideration could be what weight they are given elsewhere in the accountability regime, including Ofsted reports and floor targets; the report does not, I think, stipulate that they should be given priority.

- Additional measures are to be introduced recording schools’ achievements counting only those pupils who completed the whole of years 5 and 6 at the school, in response to concerns that schools with lots of children arriving from elsewhere feel an effect on their results. Again, it seems these results will be published alongside the existing measures, rather than replacing them.

- The report talks about placing a greater emphasis on progress measures, alongside “raw” attainment. However, progress measures already feature in league tables, are central to Ofsted’s new systems and are included in the government’s new floor targets for primaries. So call me a cynic, but it is hard to see that the report has added much here. (Overall, my hunch is that there is very little in the report as a whole with which the government would disagree – and you have to wonder, after reading this report, if this was always likely to be the outcome – but one test (pardon the pun) of that will have to await ministers’ reaction to the report.)

- Teachers will submit teacher assessment judgements before pupils’ test results are known. This seems sensible to me, as it negates the risk of the test judgement influencing the teacher assessment verdict. As the report correctly states, they are measuring different things, so the judgements reached through each assessment method should be kept separate.

- Finally, the most significant change relates to writing. Bew proposes, first, the introduction of a new test of spelling, punctuation, grammar and vocabulary. I guess teachers will have views on that; I would not comment, except to say that the report’s claim that these aspects of English have “right” and “wrong” answers was something some people were querying last week.

The recommendation, however, to replace the writing test with teacher assessment is substantial. As someone who went through secondary and university assessments in the 1980s and 1990s and was never assessed on creative writing in the exam hall, it has always seemed strange to me that 11-year-olds were asked to be creative under the time pressure of Sats. I think a move to teacher assessment, then, would undoubtedly be a good thing. It could be argued that this change alone, in promoting a better assessment experience for many children, will mean the Bew review will have been worthwhile, despite some of its more fundamental findings being so flawed.

The report does, however, mention that the teacher assessment results are to be subject to external moderation. This is unavoidable in a system which is using the scores generated to hold schools to account. Ministers, I am guessing, will want to ensure that the moderation is robust, as clearly there would be an incentive for schools to push up scores if they were under pressure over pupil achievement through, for example, the floor standards. The great danger, again, would be that the government decided that the need to use the results to judge schools was more important than providing the right assessment experience for pupils – that conflict of purposes again – and therefore moved not to accept this recommendation to move towards teacher assessment. I have, though, no evidence that this is going to happen and hope it will find favour.

Summing up, Bew’s detailed changes do stand to make some difference. But I would suggest that the arguments over the system’s underlying dysfunctionality – or not – are not going to go away. It is a shame that this report did not take more seriously, in reaching its verdict in this final report, the detail and nature of some of the concerns.

*The report does briefly consider the merits of using tests to moderate a mainly teacher-assessed system, concluding that this would not be feasible as tests and teacher assessment are not the same and thus, I think the implication is, it would be wrong to view the test as providing “true” validation of each teacher assessment verdict. I would not disagree with that as an argument, but I do not think it invalidates the Cambridge Assessment/IPPR model, since the “testlets” in this case are not meant to provide a judgement on the accuracy of teacher assessment in the case of every pupil, but merely to provide a more general check that a school has not inflated its teacher assessment judgements.

- Warwick Mansell

1 Comment
posted on July 3rd, 2011