
Bella Reichard

February 21st, 2024

Rhetoric over substance? Impact narratives are less problematic than we think


Despite being emulated in similar exercises around the world, the UK’s Research Excellence Framework (REF) has been widely criticised at home. One sticking point is the use of narrative impact case studies to assess research impact, worth 25% of the overall exercise. Bella Reichard draws on textual analyses to assess whether style trumps substance in these accounts.


The case study peer review model has a lot in its favour. Notably, it gives the much-needed flexibility to add context and enable a fair assessment of different kinds of impact. Still, some researchers point to potential biases in this approach, arguing that well-crafted case studies score better than equivalent but poorly written ones. However, for presentation to compromise scoring, there needs to be a difference in presentation between scoring brackets in the first place.

Differences between high- and low-scoring impact case studies could be suspected in three areas:

  1. content – different impact, significance and reach;
  2. language – for example, grammar or word choice;
  3. other aspects of presentation – from emphasis on different types of content to formatting.

Differences in content are what we expect to be assessed. Differences in language, or in the way that narratives are crafted, should have no influence on the assessment. If there is a significant difference in the latter, those arguing that presentation took precedence over content may be right.

A superficial reading of earlier work on high-scoring and low-scoring case studies from REF2014 could lead to this conclusion, although in that paper my colleagues and I emphasised the characteristics of high-scoring case studies in order to level the playing field and avoid “false negatives” in the 2021 REF assessment. The different types of analysis in my PhD research indicate that the measurable differences between high- and low-scoring case studies do not point to a substantial influence of presentation on scoring in REF2014. There were of course differences in content between high- and low-scoring case studies, and it is important to distinguish between these factors, which were likely to affect scores, and factors related to presentation, which were less likely to do so.

1. There are differences in content between high- and low-scoring case studies

84% of high-scoring cases articulated benefits to specific groups and provided evidence of their significance and reach, compared to 32% of low-scoring cases. The low-scoring case studies typically focused instead on the pathway to impact, describing dissemination of research findings and engagement with stakeholders and publics without citing the benefits arising from dissemination or engagement. This finding is based on a collaborative thematic analysis of 85 high- and 90 low-scoring case studies across Main Panels.

Most texts included material on research, pathway and impact, but seven of the low-scoring case studies did not include any impact claims in the summary.

For a sub-sample of 76 case studies, I classified all the material in Section 1 (“Summary of the impact”) as relating to either research, impact or pathway. Most texts included material on all three, but seven of the low-scoring case studies did not include any impact claims in the summary. Both findings could reflect a problem of presentation, if the impact was there but was not articulated, or a problem of content, if the impact (by REF definitions) did not exist. While it is of course impossible to conjure up additional impact content, this work hopefully helped highlight the presentational pitfall in the run-up to REF 2021.

2. There is a marginal difference in readability

High-scoring case studies were easier to read on common readability measures, but only marginally. Compared with both general English texts and research articles, the mean readability scores of high- and low-scoring case studies (n = 124 and 93 respectively) are very close to each other. Moreover, the tool we used gives eight different measures of cohesion, and six of these do not differ significantly between high- and low-scoring case studies. The two where a difference can be found are causal links (e.g. “because”, “so”) and logical connectivity (e.g. “and”, “but”): high-scoring case studies were better connected. The difference is significant, but with a moderate effect size. Set against the overall difference there could have been, the similarities are far stronger.
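For readers who want a feel for this kind of comparison, here is a minimal sketch. It is not the author’s pipeline (the post does not name the tool used); the sample texts, the choice of the Flesch Reading Ease formula via the Python textstat package, and the use of Welch’s t-test are all illustrative assumptions.

```python
# Illustrative sketch only, not the study's actual pipeline: compare the
# readability of two groups of texts (pip install textstat scipy).
import textstat
from scipy import stats

def flesch_scores(texts):
    """Flesch Reading Ease per text; higher means easier to read."""
    return [textstat.flesch_reading_ease(t) for t in texts]

# Hypothetical stand-ins for the high- and low-scoring case-study corpora.
high_scoring = [
    "The research changed national screening policy. Uptake rose by a fifth.",
    "Findings informed new clinical guidance. Clinics adopted it within two years.",
]
low_scoring = [
    "Dissemination of the aforementioned multidimensional findings was undertaken "
    "via stakeholder engagement mechanisms across heterogeneous institutional contexts.",
    "Methodological outputs were communicated through practitioner-facing fora.",
]

high = flesch_scores(high_scoring)
low = flesch_scores(low_scoring)
t, p = stats.ttest_ind(high, low, equal_var=False)  # Welch's t-test
print(f"mean high = {sum(high)/len(high):.1f}, "
      f"mean low = {sum(low)/len(low):.1f}, p = {p:.3f}")
```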

3. There is a marginal difference in evaluative language

Focusing on the type of evaluative language used in Section 1 to signal the scale and quality of claimed impacts, I used the Appraisal Framework to tag the 76 texts in the sub-sample for 47 different features. There were measurable differences in only five of these; the other 42 features were used fairly evenly across scoring brackets. Again, given how much difference there could have been, the measurable difference is probably not enough to have influenced scoring. Moreover, the features where a difference was statistically significant related either to content (over which the writer has no influence) or to the level of specificity: high-scoring case studies gave more specific details about locations and timelines than low-scoring ones. This may have cost some high-impact case studies a better score, but it is unlikely to have inflated the scores of less deserving impacts.
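To illustrate how the use of a single tagged feature can be compared across scoring brackets, the sketch below runs a chi-square test on a 2×2 contingency table. The counts are invented and the choice of test is an assumption, not a detail taken from the study.

```python
# Illustrative sketch, not the author's method: given counts of texts in
# each scoring bracket that do or do not use a tagged feature, test
# whether usage differs between brackets (pip install scipy).
from scipy.stats import chi2_contingency

#              uses feature   does not use it
table = [[30, 8],    # high-scoring case studies
         [18, 20]]   # low-scoring case studies

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")  # small p -> usage differs by bracket
```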

The differences in language between high- and low-scoring case studies in REF2014 can mostly be explained by the fact that the genre was new to everyone, without the database of example texts that was available in 2021. These differences were fairly straightforward to bring into the open, and many could hopefully be avoided in later iterations. Others are symptoms of the content they represent: for example, if a case study reports on research and dissemination activities but not impact, the lack of assessable impact is reflected in the lack of impact-related material in the text. In this case, a low score is fair (within the REF framework) and not based on assessor bias.

The statistically significant differences between high- and low-scoring case studies are not enough to assume that language choices had an undue influence on scores.

Overall, compared to the number of indicators where there could have been differences, the statistically significant differences between high- and low-scoring case studies are not enough to assume that language choices had an undue influence on scores. This assumption has usually been based on self-reports by assessors, either directly (McKenna) or via observations and interview data (Manville; Watermeyer and Hedgecoe). Analysis of textual data adds further evidence to this picture and allows decision makers to have more confidence in the integrity of the exercise.


This post summarises findings from the author’s PhD research at Newcastle University (ongoing) and from the paper Writing impact case studies: a comparative study of high-scoring and low-scoring case studies from REF2014, published in Humanities and Social Sciences Communications. Bella Reichard also writes for her own blog. She thanks Mark Reed, who made helpful suggestions for this post and on whose website a version of it first appeared.

The content generated on this blog is for information purposes only. This article gives the views and opinions of the author and does not reflect the views and opinions of the Impact of Social Science blog (the blog), nor of the London School of Economics and Political Science. Please review our comments policy if you have any concerns about posting a comment below.

Image Credit: Demosthenes on the seashore, Eugène Delacroix, National Gallery of Ireland (CC BY 4.0). 



About the author

Bella Reichard

Bella Reichard is an independent research impact consultant and a PhD researcher at Newcastle University.

Posted In: REF2014 | REF2029 | Research evaluation
