Publication last December of the results of the 2014 Research Excellence Framework evaluation of the quality of work undertaken in all UK universities attracted much attention, as league tables of university and department standings were constructed and the financial consequences of the achieved grades were estimated. Soon after that, a book was published savagely criticising the peer reviews at the heart not only of that exercise but also of the mock exercises that universities ran as they prepared their submissions. Ron Johnston reviews Derek Sayer’s critique.
Rank Hypocrisies: the Insult of the REF. Derek Sayer. SAGE Publications, 2015.
The UK University Research Assessment Exercises (RAEs), replaced by the Research Excellence Framework (REF) in 2014, have had many fundamental impacts on UK academic life since the first exercise in 1986. Over subsequent decades their intensity has increased very substantially, and universities invest a great deal in preparing their submissions.
The 2014 REF required from each unit being submitted (usually a university department): copies of each of the included staff members’ (in most cases) four ‘best’ research publications during the previous six years; a specified number of ‘impact’ case studies; and a statement regarding the departmental research environment. All were rated on a scale from 1* to 4*, and each department’s work was summarised according to the percentage getting each grade. Only work graded either 3* or 4* would be rewarded in the allocation of the multi-million pound Quality Research (QR) funds in subsequent years.
Derek Sayer’s main focus in Rank Hypocrisies is on the largest of these elements – the ranking of individual scholars’ research outputs, which counts for 65% of the overall evaluation. His first chapters – ‘Benchmarks’ and ‘Potemkin’s Village’ – usefully set out the basic history and practices of the RAEs/REF. But the book was written – with great clarity and obvious intense anger – to promote debate over his concern regarding the REF’s impact on British academic life in general and on many individual academics’ lives and careers in particular.
That impact reflects not only the all-embracing evaluation culture, and the consequent league tabling, that the REF represents but also the gaming that occurs in universities seeking to maximise their returns – both financial and charismatic. They have a balancing act to perform: do they only enter those academics whose work is likely to be graded either 4* or 3*, and so come high up the league tables, or should they enter a wider range, including work that may only be rated as 2*, in order to get more money, whose allocation is based on the average grade multiplied by the number of staff entered? The practices employed at Sayer’s home university – Lancaster – illustrate those games.
Image: The Ladder of Divine Ascent (Wikipedia, public domain).
‘Benchmarks’ contrasts conduct of the REF with other decisions about the quality of academic work – journal refereeing and assessment of promotion/tenure cases. In the former, editors send papers to specialised experts for judgements; in the latter, a candidate’s full portfolio is sent to one or more senior scholars from among those best placed to provide an overall judgement. This is peer review as it should be practised – but was not in REF2014. Its assessments involved relatively small panels, compared to the breadth and volume of work they had to evaluate – most of which, in Sayer’s (undoubtedly correct) view, they were insufficiently knowledgeable about to reach a fully-informed judgement, albeit against the vague criteria defining the four grades: work regarded as ‘world leading’, ‘internationally excellent’, ‘recognized internationally in terms of originality, significance and rigour’, or only recognized ‘nationally’ respectively. His critique does not belittle the panel members’ efforts (interest to be declared here: I was a panel member in the 1989 and 1992 exercises), but rather demonstrates that the procedures were not consistent with proper peer review. (Some panel members and other readers had to assess hundreds of publications, including books, in a relatively short period – and yet many academics decried the proposed use of metrics and insisted on peer review.) Sayer’s conclusion that ‘the chances of outputs in history being read by panellists who are experts in an author’s field are very unevenly distributed’ (p.43) undoubtedly applies widely across disciplines; he clearly would not accept a bowdlerisation of a US Supreme Court Justice’s well-known saying that REF panellists ‘may not be able to define a 4* journal article but they know one when they see one’! The judgements and all that flows from them are thus potentially flawed.
Most universities instituted ‘mock REF’ exercises to decide which staff and outputs should be entered, and the Higher Education Funding Council for England (HEFCE) issued a Code of Practice requiring that they be characterised by transparency, consistency, accountability and inclusivity. Sayer’s case study of the Department of History at the University of Lancaster, plus occasional references to events and practices elsewhere, in a chapter entitled ‘Down and Dirty’, finds these exercises even further from the proper practice of peer review than the REF panels’. Many decisions on whether a published item, or even an individual academic’s full portfolio (‘best four’), was submitted were made on the advice of a single individual invited to consider an entire departmental submission strategy and its contents: he/she may have been no better placed to assess much of that material than an informed amateur, so those ‘expert assessments … are anything but trustworthy’ (p.78).
This has wide implications. HEFCE was to publish the names of staff members submitted, from which those excluded could be readily deduced. Individuals’ careers could be permanently damaged by such uninformed judgements, long after the ‘perpetual climate of anxiety’ (p.78) during the preparation period, especially if information on how and why decisions were made was not disseminated, and appeals procedures were inadequate – which Sayer claims was the case at Lancaster and elsewhere.
In the concluding chapter – ‘The Abject Academy’ – Sayer, perhaps surprisingly, suggests that metrics could do as good a job as flawed peer review (the QR allocations would very largely go to the same institutions), saving much time and money that could be spent on research itself rather than on paying expert assessors and the legions of administrators appointed to prepare for the REF, some of whom have never done any research themselves (most universities have already initiated preparations for the expected 2020 exercise). Universities, of course, are caught in a prisoner’s dilemma. They have little alternative but to play the REF game if they want as big a share as possible of the QR money; if they opt out, or play badly, the only beneficiaries are their opponents/competitors.
Rank Hypocrisies is not the last word on the 2014 REF, but it is a partial, powerful, well-researched polemic (Sayer apologises for its ‘dry and sometimes legalistic style’ – p.viii) uncovering the poverty of academic decision-making at the heart of such flawed exercises. No hostages are taken, no named individuals spared criticism where it is considered due – all undertaken in defence of academic standards, academic freedom and the right of individuals to expect that their employers exercise a duty of care.
The exercises have got ‘too big for their boots’ and created havoc. But will anything change? Will those promoting metrics prevail for the 2020 exercise – and what game-playing will that stimulate? Or will those who oversee and fund universities just continue allocating large sums of money on the basis of well-meaning but flawed assessments of what we write – in which individual academics feel obliged to participate, to ensure they are done as well as possible?
Ron Johnston is a professor in the School of Geographical Sciences at the University of Bristol and formerly a Pro-Vice Chancellor at the University of Sheffield and Vice-Chancellor of the University of Essex.