Criticisms have been made of recent influential studies showing improved performance in hospitals operating in more competitive environments compared with hospitals that hold a local monopoly on care. Zack Cooper, Steve Gibbons, Simon Jones and Alistair McGuire set the record straight: the claims by Pollock et al are based either on distortions of the original research or on an apparent lack of understanding of modern economic analysis.
In a recent blog post, Allyson Pollock, Alison Macfarlane and Ian Greener misrepresented our work and built straw man arguments designed to undermine our findings. This might pass as political discourse, but is it academic debate?
Perhaps these objections arose because of the publicity that our work has received, rather than the detail of the research. But our job as academics is to produce the most rigorous research possible and, beyond that, to promote evidence-based policy. When we produce evidence that can have an impact on policy, we present this work and make it available to policy-makers, the press and the public. Researchers in publicly funded universities are expected to publicise their work to policy-makers in a timely manner. How else should good policy be formed, or research funding justified?
Pollock and her co-authors decried ‘the drip feed of pro-competition studies’ we have produced. In fact, there are just two studies. Our first study, in the Economic Journal (EJ), looked at the impact of competition – by which we mean a move to less monopolistic local markets – on quality. This work illustrated that competition between NHS providers in a market with fixed prices led to better outcomes. Our findings were consistent with what economic theory would predict, and they precisely mirror the academic literature from empirical research in the US. More than that, since our research came out, two subsequent studies by separate research teams (Gaynor et al and Bloom et al) have reached nearly identical conclusions about the positive impact that fixed-price competition has had in the NHS.
Our second paper looked at the impact of this competition on patients’ length of stay (and was an expansion of an earlier paper). It showed that NHS providers in competitive environments shortened their pre-surgical and overall lengths of hospital stay (which we regard as evidence of improvements in efficiency). In contrast, the net effect of the introduction of private providers into the market was to increase the average length of stay in NHS hospitals, which is potentially suggestive of cream-skimming. This latter finding is not an overtly pro-competition result. These studies provide precisely the kind of evidence that policy makers look for, so that they can learn what has and has not worked in the past and chart a sensible path forward. This is why they have had a significant impact. Of course, wide-reaching policy should not be set on the basis of one study. However, as a body of evidence grows, the case for policy action becomes more persuasive.
What about the counter-arguments? So far, critics have not articulated a theory as to how fixed price competition could undermine quality in the NHS. They have presented no evidence of their own that competition has harmed patient outcomes.
To be fair, Professor Greener has done work in this area. For example, in 2009 he published an article in the journal Public Money and Management titled ‘Patient Choice in the NHS: What is the Effect of Choice Policies on Patients and Relationships in Health Economies’. This ethnographic study presents insights into the attitudes of hospital managers and staff in one NHS trust. But is it ‘good science’, in contrast to our work, which he and Pollock call ‘bad science’ and have criticized in the Lancet?
Greener’s piece drew on 60 semi-structured interviews with NHS staff at a single NHS hospital; no patients were interviewed during the course of the research. From these interviews, he concludes that: ‘The case presented suggests that patient choice policies fall short on all of the conditions that are necessary for them to work. Patients in the case study were reluctant to exercise choice decisions’. Here, Greener is happy to use a qualitative style of research (interviews at a single hospital) to draw conclusions of his own against a national policy.
However, when other researchers use qualitative research together with quantitative evidence to show that competition can have positive effects, there is less tolerance. In their co-authored Lancet comment piece attacking our research on competition, Pollock et al. dismissed work by Nick Bloom, Carol Propper, John Van Reenen and Stephen Seiler, stating disparagingly that in their study: ‘An association with management quality is based on interviews with 161 senior staff that did not take account of relevant causal factors’. Bloom et al. involved interviews at 100 hospitals and integrated qualitative fieldwork with advanced econometrics. If this is ‘bad science’, what are we to make of the critics’ own work, which adopts a related approach?
Henry Overman has kicked off a good conversation on this blog about what constitutes sensible blogging – we hope this discussion continues. Hopefully such a debate, plus our reply here, will provide a teachable moment to pause and reflect on how academics discuss evidence, consider the casual use of phrases like ‘bad science’ and begin a thoughtful discussion of the role of blogs in academic and policy debates in the social sciences.
In the Annex below we give a further point-by-point rebuttal to the criticisms of our work. It is worth noting that Professor Pollock has raised these points before and we have responded to them twice: in a Lancet letter, and in a freely accessible 8-page document posted online (a detailed response that Pollock et al. do not mention), also included as a linked appendix to our Lancet reply.
Zack Cooper is a health economist at LSE Health at the London School of Economics. Zack’s work focuses on assessing the impact of competition in hospital and insurance markets and analyzing the effect of financial incentives and payment reforms on health care delivery.
Steve Gibbons is a Reader in Economic Geography, Department of Geography and Environment at the London School of Economics. He is also Research Director of the ESRC/BIS/WAG funded Spatial Economics Research Centre.
Simon Jones has a joint appointment as a Visiting Professor within the Department of Health Care and Policy at the University of Surrey and as Chief Statistician at Dr Foster Intelligence.
Alistair McGuire is Professor of Health Economics in the Department of Social Policy at the London School of Economics.
ANNEX: DETAILED RESPONSES TO POLLOCK ET AL’S CLAIMS
Claim 1: ‘The major improvements in outcomes after acute myocardial infarction can be attributed to improvements in primary prevention…’
In their third paragraph, Pollock et al argue that observable reductions in AMI death rates were attributable to improvements in AMI care itself, including ‘primary prevention in general practice and in hospital care, including the introduction of percutaneous IV angiography’. The implication is that competition does not explain the widespread decrease in AMI mortality during this period.
Our response: Our research does not examine the causes of the widespread reduction in AMI mortality during this period, and it does not claim that competition was the main factor behind falling AMI mortality. Instead, our study investigates whether AMI mortality fell more quickly in more competitive areas after choice was introduced in 2006 (set against the background of these larger and more general changes). We illustrate that there was no difference in the rate of mortality reduction in more competitive versus less competitive areas prior to 2006, before competition was introduced. We then illustrate that mortality fell more quickly in more competitive areas from 2006 onwards, once patients in England had choice and hospitals in England had to compete.
We state explicitly in our EJ paper on page 243 that: ‘Hospital quality, measured by 30-day AMI in-hospital mortality, improved consistently from 2002 to 2008 [the period pre-dating widespread patient choice], as shown in Table 2. Likewise, the number of AMIs treated per year also fell. This reduction in mortality and reduction in overall AMI occurrences is consistent with international trends and is driven, in part, by increasing adoption of new technology in the treatment of AMI and improvements in public health…’.
Our main finding is that prior to 2006 there was no difference between the rates of mortality reduction in more and less competitive areas, but from 2006 onwards, after competition was introduced, more competitive areas reduced their AMI mortality more quickly. Hence, we say: ‘Ultimately, we find that after the introduction of these reforms in 2006, our marker for service quality (AMI mortality) improved more quickly for patients living in more competitive hospital markets’ (page 229).
Claim 2: ‘Not all patients were aware that they could choose and that AMI was an emergency procedure’.
Here, Pollock et al make two claims related to the impact of patient choice on AMI outcomes. First, they claim that we simply assume choice occurred, and that since only 49 per cent of patients recalled being offered a choice, this was not enough to create sufficiently strong incentives for providers to improve. Secondly, they quote Roger Boyle saying: ‘It is bizarre to choose a condition where choice by consumer can have virtually no effect’.
Our response: A 2010 King’s Fund report by Dixon et al ‘found that just under half (49 per cent) of patients recalled being offered a choice’. Yet this fact does little to suggest an absence of financial incentives for hospitals. Consistent with the wider industrial organization literature, we believe that if 49% of a hospital’s patients were offered choice, this creates sufficiently large financial incentives for hospitals to improve quality. More than that, the threat of patient choice alone is also likely to create an incentive for providers to improve their performance, and by a wide majority NHS patients are in favor of having choice over where they receive care. Evidence from the British Social Attitudes Survey shows that, when asked, 75% of the British public say they want the ability to select their hospital (Appleby and Phillips, 2009). On page 233 of our EJ paper, we outlined the mechanism through which competition should improve hospital quality, including highlighting the role that GPs (as opposed to patients themselves) played in these reforms.
On the critics’ second point, that AMI mortality may not be impacted by competition in the elective market, this is something we address in multiple places throughout the paper. On page 234, we write: ‘While providers are not explicitly competing for AMI patients because competition in the NHS is limited to the market for elective care, we expect the market-based reforms to result in across-the-board improvements in hospital performance, which in turn will result in lower AMI death rates. To that end, Bloom et al. (2010) looked at NHS hospitals and found that better managed hospitals had significantly lower AMI mortality and that greater hospital competition was associated with better hospital management’. In addition, in Appendix A, we pen a mathematical explanation of this for more technical readers.
Claim 3: ‘Crucially, even if patient choice had occurred it does not explain why heart attack mortality rates fell. There is no biological mechanism to explain why having a choice of providers for elective hip and knee operations …could affect the overall outcomes from acute myocardial infarction’.
Here, the claim is that there is no biological mechanism through which giving patients a choice and increasing incentives for hospital quality could lower heart attack death rates.
Our response: It is surely clear from reading the paper that the theoretical mechanism we consider is that the extension of choice created financial incentives for hospitals (and their staff) to improve the quality of their services, and that improvements in one area of service will have an impact on quality in other areas. In short, competition improved care quality, and improved care quality lowered death rates.
Claim 4: ‘they seem unaware that lengths of stay differ between the conditions they examine’.
Here, Pollock et al note (rightly) that there are different lengths of stay between hip replacements, knee replacements, arthroscopies and hernia repairs. They argue (wrongly) that this was not controlled for in our most recent work and that this could be biasing our results.
Our response: As we note on page 12 of our paper, and on every table that includes regression results in our analysis, we include ‘fixed effects’ for procedures, which control for differences in length of stay across procedures (i.e. we identify from within-procedure differences over time). Our results are robust with and without these controls, and are similar for each of the procedures individually. We chose to pool the data and include fixed effects to allow for a neater presentation of our results.
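For readers unfamiliar with the mechanics, here is a minimal sketch of what pooling with procedure fixed effects looks like in a standard regression package. All column names (length_of_stay, competition, post_2006, procedure) are hypothetical illustrations; this is not the authors’ actual code or specification.

```python
# Minimal sketch of a pooled regression with procedure fixed effects.
# All column names here are hypothetical, not taken from the paper.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("episodes.csv")  # hypothetical episode-level extract

# C(procedure) adds a dummy for each procedure (hip replacement, knee
# replacement, arthroscopy, hernia repair), absorbing baseline differences
# in length of stay across procedures. The effect of interest is then
# identified from within-procedure variation over time.
model = smf.ols(
    "length_of_stay ~ competition * post_2006 + C(procedure)",
    data=df,
).fit()
print(model.summary())
```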
Claim 5: Correlation versus causation
Pollock et al argue that ‘the authors make the cardinal error of not understanding their data and of confusing minor statistical associations with causation’.
Our response: In line with much modern applied work in economics, our methods aim to mimic the conditions of a controlled experiment, given that explicit experimentation is infeasible in real-world policy analysis. Articles like Imbens and Wooldridge (2009) and Angrist and Pischke (2010) give a sense of how the methods we use are aimed at measuring causation, for anyone interested who is not familiar with these ideas.
These articles discuss empirical techniques used in statistical analysis that are designed to assess causal relationships between policies and outcomes. Both discuss difference-in-difference regression, which is the technique used in our research papers. This statistical technique is designed to identify causation, and it is frequently used in policy analysis and program evaluation. Of course, no empirical analysis – other than, perhaps, an ideally designed large-scale randomized controlled trial – can ever unequivocally disentangle correlation from causality. But researchers have to work with the methods and data that are available in order to try to say something useful about causality.
Difference-in-difference (DiD) regression is a real-world application of the same statistical techniques/theory used in randomized controlled trials (RCTs). The key assumptions underpinning our DiD regression are that:
(i) There were no pre-existing differences in acute myocardial infarction (AMI) mortality trends or in length-of-stay trends between our treatment group of hospitals (operating in less concentrated markets) and the control group of hospitals (operating in a monopolistic market); and
(ii) The policy (in our case the introduction of patient choice into these markets) was not introduced at exactly the same time as other policies that would also have improved outcomes in the treatment group of hospitals relative to the control group.
These are assumptions we tested directly in the paper. We illustrate that there were no pre-reform differences in performance between hospitals in competitive and non-competitive markets. Likewise, we measure market structure in 13 separate ways and show that the results are similar in each approach, despite the fact that our measures of market structure are not heavily correlated with each other. In addition, we show that our results were not driven purely by urbanization, by testing directly whether urban areas did better after 2006 (they did not). DiD regression has a long history in policy analysis. Indeed, it was first used in a seminal public health study by Snow (1855) looking at the causes of cholera in London.
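To make the mechanics concrete, here is a textbook 2x2 DiD sketch, including a pre-trend check of assumption (i). The column names (ami_mortality, year, hospital, market_competition) are hypothetical, and this simplified specification is an illustration of the technique, not the one used in the papers.

```python
# Textbook difference-in-difference sketch. All column names are
# hypothetical; this is NOT the specification used in the papers.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("ami_panel.csv")  # hypothetical hospital-year panel

df["post"] = (df["year"] >= 2006).astype(int)  # choice introduced in 2006
df["treated"] = (
    df["market_competition"] > df["market_competition"].median()
).astype(int)  # hospitals in less concentrated (more competitive) markets

# The coefficient on treated:post is the DiD estimate: the extra change in
# AMI mortality in competitive markets after 2006, over and above the
# common trend shared by all hospitals.
did = smf.ols("ami_mortality ~ treated * post", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["hospital"]}
)
print(did.params["treated:post"], did.bse["treated:post"])

# Check of assumption (i): in the pre-2006 years, treated and control
# hospitals should share the same trend, so treated:year should be ~0.
pre = df[df["year"] < 2006]
placebo = smf.ols("ami_mortality ~ treated * year", data=pre).fit(
    cov_type="cluster", cov_kwds={"groups": pre["hospital"]}
)
print(placebo.params["treated:year"])
```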
Claim 6: ‘Deaths from acute MI are not a measure of quality of hospital care, rather a measure of access to and quality of cardiology care’.
Here, the claim is that AMI mortality is not a measure of hospital quality, but rather a measure of access to the hospital and, specifically, to cardiology services.
Our response: Again, this ignores evidence presented in our EJ paper, where we outline our rationale for using AMI as a quality indicator on page 237, and on the next page show that hospitals with higher AMI death rates also have higher overall death rates, longer lengths of stay for elective hip and knee replacements, and longer waiting times. Bloom et al show that better managed hospitals have lower AMI mortality and higher patient satisfaction. This is why AMI mortality has been used as a measure of hospital quality in a number of studies, including Kessler and McClellan (2000), Volpp et al. (2003), Propper et al. (2004), Kessler and Geppert (2005) and Gaynor et al. (2010). These are not simply economics articles written by empiricists unversed in medicine: Mark McClellan and Kevin Volpp are both trained medical doctors (in addition to being trained economists).
Claim 7: ‘Equally, the authors do not look at how clinical coding changed following the introduction of the tariff in 2006’.
Here Pollock et al imply that changing coding practices could be producing the results that we observe and that this is something we ignore.
Our response: Our methodology allowed for multiple codings of the same procedure. In our paper, we included at least three separate OPCS-4 codes for each procedure, specifically to allow for the possibility that Pollock et al raise: that the same procedure was coded differently over time. We bundled all procedures into one master heading – e.g. hip replacements – and did not define procedures using a single code.
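As an illustration of what this bundling looks like in practice, the sketch below maps several OPCS-4 codes to a single procedure heading. The code lists shown are placeholders, not the actual codes used in the paper.

```python
# Illustrative only: these OPCS-4 code stems are placeholders, NOT the
# actual code lists used in the paper.
PROCEDURE_GROUPS = {
    "hip_replacement": {"W37", "W38", "W39"},
    "knee_replacement": {"W40", "W41", "W42"},
}

def classify(opcs_code):
    """Map a raw OPCS-4 code to a pooled procedure heading (or None)."""
    stem = opcs_code[:3]  # compare on the three-character code stem
    for heading, stems in PROCEDURE_GROUPS.items():
        if stem in stems:
            return heading
    return None

# A procedure coded under any of the grouped codes lands in the same
# master heading, so changes in coding practice over time do not move
# episodes in and out of the sample.
assert classify("W37.1") == classify("W39.2") == "hip_replacement"
```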
Claim 8: ‘Finally, length of stay is also a product of a range of factors related to the conditions in their data; pre-operative work for hip and knee replacement needs to take account of rurality and patient fitness for discharge, especially if patients live alone and have other co-morbidities and complexities’.
The critics say here that it is imperative to control for the ‘rurality’ of patients and their underlying co-morbidities.
Our response: Our paper controls for these factors. We initially introduced controls for co-morbidities using the Charlson index of co-morbidity. However, since they did not alter our main estimates, we chose not to include them, because the counting of co-morbidities during this period was potentially subject to up-coding. We control for socio-economic status using the income domain of the 2004 Index of Multiple Deprivation, and we additionally control for patients’ gender and age. ‘Rurality’ is controlled for by including a measure of the distance from the patient’s registered GP to the hospital where they received care, and by using hospital fixed effects. It is also accounted for in the way we design many of our market structure indices, which take into account actual GP referral patterns, travel times and population density.
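Putting these pieces together, a regression of the kind described might look like the hedged sketch below. The main effects of market structure and the post-2006 indicator are absorbed by hospital and year fixed effects, and the patient-level controls enter directly. All column names are hypothetical, and the specification is illustrative rather than the authors’ own.

```python
# Hedged sketch of the control set described above; all column names are
# hypothetical and the specification is illustrative, not the authors'.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("episodes.csv")  # hypothetical episode-level extract

formula = (
    "length_of_stay ~ competition:post_2006"  # DiD interaction of interest
    " + C(hospital) + C(year)"    # absorb hospital and common time effects
    " + C(procedure)"             # procedure fixed effects (see Claim 4)
    " + age + C(sex)"             # patient demographics
    " + imd_income"               # income domain of the 2004 IMD
    " + gp_to_hospital_km"        # GP-to-hospital distance ('rurality')
)
model = smf.ols(formula, data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["hospital"]}
)
print(model.params["competition:post_2006"])
```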
Claim 9: ‘they ignore the political context in which the data was generated’
The claim here seems to be that the Hospital Episode Statistics (HES) data are not accurate and that the ‘political context’ in which they were collected undermines their accuracy.
Our response: For any such ‘manipulation’ to affect our results, there would need to be systematic manipulation of the data, such that hospitals located in competitive areas deliberately reduced their recorded lengths of stay by manipulating their admission and discharge dates. Furthermore, they would have needed to do so in a way that happened to coincide precisely with the introduction of choice and was correlated with the three measures of market structure we used in our efficiency paper and the 13 measures we used in our EJ paper.
Contributors to this debate may be interested in a piece published in the BMJ today by David Hunter and Gareth Williams. It is behind a paywall, I’m afraid, but it makes a range of interesting objections to the Cooper work: http://www.bmj.com/content/344/bmj.e2014.full
Some thoughts on the various issues raised in the blogs by Allyson et al, Julian and Zac… and the various replies… all about the ‘Zak phenomenon’.
(1) I agree with Ian (Greener) that the recent book (Mays et al) on evaluating New Labour’s ‘market reforms’ does not provide evidence that ‘competition has worked’. It may provide evidence that the NHS has not faced Armageddon as a result of New Labour’s neo-liberal turn, but that may be because the latter was hot on rhetoric and shambolic in practice, i.e. there was not ‘a market’ of any consistent sort (and certainly not one which obeyed the ‘conditions for a successful quasi-market’ as set out by Julian et al from the early 1990s onwards… and CERTAINLY not one which did not exist before 2006 yet which appeared like a silver bullet across England’s green and pleasant land after 2006 – the ‘policy off, policy on’ assumption made by Cooper et al in their 2011 ‘mortality rates’ study).
See my review of Mays et al in International Journal of Health Planning and Management, Vol 26 No 4, 2011 – not for an analytical elucidation of this but for a pointer to the complexity and doubt.
(2) I agree with Adam (Oliver) that the Cooper et al studies (2011, above; plus the recent one on lengths-of-stay) cannot EASILY be picked apart conclusively. I also agree with him that – in my words – these studies are being drastically over-interpreted by the academic community and (more worryingly) by the ‘London consensus’ on health reform (which embraces the PM, Andrew Lansley, the Nuffield Trust et al). If academics were playing for the low stakes that Henry Kissinger once remarked upon (to paraphrase: academic disputes are so vicious because the stakes are so low!), it would not matter so much… but at times like these, when the Health and Social Care Bill has been such a self-imposed headache for the Coalition, the likes of Cameron are desperate to pull ‘research evidence’ out of the ‘garbage can’ (Cohen, March and Olsen 1972) and justify policy already made for ideological reasons (whether overtly ideological or, following Keynes, the defunct economists’ dogma favoured by self-styled ‘practical men’).
Let me explain: Cooper et al attempt to ‘control for’ lots of the things likely to be used by their ‘enemies’ as proof of their methodological shortcomings. And their graphical depiction of the pre-2006 and post-2006 trends (on SMRs and lengths-of-stay, i.e. in both studies) suggests that ‘something is going on’.
But we really cannot possibly know what – yet, if ever. It is not just a question of needing qualitative research to know what was going on in the ‘black box’ – true as that is (and I agree with some comments – how on earth can concern for the reputation of one’s Trust in an era of competition ‘motivate’ clinical teams dealing with AMI?… it’s just not the way the NHS world works).
It is more: there was no ‘policy off, policy on’ moment in 2006 as regards ‘choice and competition’. So for now, and probably for ever, we cannot assume that Cooper et al explain changed ‘outcomes’ any more definitively than the (even) less plausible research by the OHE/Aberdeen which sought to show that the tariff in England had certain effects earlier in the 2000s – as a result of a ‘boffin’s black box’ comparison of England and Scotland. (Er… that was also the high noon of targets in England yet not (yet) in Scotland… doh!)
Hypothesis: a combination of various things produced the Cooper results: different SHAs’ policies towards rationalisation of services (contaminating variables such as ‘concentration’ and ‘degree of competition’, probably not randomly); and the fact that quality came on the national radar screen from this time onwards (as Stafford et al began to brew in the background… and paradoxically as the ‘deficit crisis’ of 2006 led to cuts in acute hospital budgets BUT good old central diktat to ensure that A&E quality did not suffer).
Not proven… but at least a candidate to beat a candidate, to quote the late Milton Friedman (even scripture can quote the devil!).
And let’s say that it was competition, or reputational concern in a more competitive age, which produced the ‘goods’ that Zak et al document. What does that tell us? It tells us that we should allow patients, with their GPs, to go to whichever hospital/provider they wish, subject to what is available within a cash-limited budget and an NHS which rightly takes account of the need to allow, indeed provide, inter-provider clinical networks (both acute-acute and acute-primary-community-social, i.e. ‘integrated care’). None of this necessitates the ‘top-down re-disorganisations’ favoured by New Labour as it stumbled down the neo-liberal route, nor the mother of all such ‘top-downers’, i.e. the Lansley Messes Marks 1 and 2.
When you add in all the costs of ‘achieving competition’ Blair-style or Lansley-style, then I suspect that each of Cooper’s ‘lives saved’ falls way below the QALY funding threshold…!
Which is another of my beefs about ‘cost-effectiveness’ of health reforms: studying them at hospital/provider level usually underestimates the systemic costs ‘catastrophically’!
Incidentally, Julian is right to say that believers in competition should urge that the latest re-disorganisation be abandoned… and Alan Milburn and the other New Labour neo-liberals seem to think the same. I come from the other side of the fence – abandon the Bill (although of course it won’t happen) and then allow choice within an integrated public system where the money follows the patient without market paraphernalia.
Markets without choice are common (very common!). So let’s hear it for choice without markets.
Well, I’ve read both sides of this argument and I’m afraid to say, having worked in statistical analysis for over 30 years and being familiar with every possible permutation of interpretation, Zack Cooper’s methodology holds little water as a reliable basis for proof. For an LSE economist, I find his analysis lamentably shoddy.
I think there are a number of points I’d like to raise here – but time is short and I don’t want to fill up the entire comments space.
First of all, Cooper et al’s work (hereafter ‘Cooper’) has been used to justify the present NHS reorganisation. I don’t know if Cooper supports these changes, but the PM has certainly made use of his work. It therefore seems to me that he has a significant duty to make sure that the claims he is making are proportionate. Are they?
Well, if you are going to claim that ‘competition saves lives’, you’d better have some pretty significant proof. I would maintain, I’m afraid, that he doesn’t have this. What he has is some interesting outcome data post-2006. He says he has economic theory explaining how this could have led to those improved outcomes, but he hasn’t investigated that causal chain at all – he simply doesn’t know it is there. This isn’t just my view – it is also the view, for example, of Gwyn Bevan at the LSE, whose review of Cooper’s research in the BMJ pointed out its ‘black box’ nature, and who was quoted in the Guardian on March 6th as saying he was ‘perplexed’ by the focus on choice and competition in the NHS reorganisation because ‘the evidence is very weak and contested’.
I’m grateful that Cooper and his colleagues refer readers of this blog to a piece of mine – I’m always happy to be cited. It would have helped, however, if they had got the name of the journal correct (it’s called ‘Public Money and Management’, and the piece is one of several I’ve published on choice in the last ten years). This particular piece, had they dug a little deeper, was part of an NIHR-funded project where we looked at several sites across the country, and I was reporting on the basis of just one of them. I wasn’t claiming, unlike them, that either choice or its absence had saved any lives. From spending nearly two years in what should have been a highly competitive area of the country (according to their measures), I was reporting that I’d found no evidence of it – in fact exactly the opposite. This approach is a little different from looking for associations between reported management practices that have been labelled as good, and hospital performance. Perhaps the researchers aren’t very familiar with qualitative research?
I’ll make a few more points and stop here. First, DiD is not at all like an RCT. RCTs specify their theories (or put in place their interventions) and test data in line with them. As such, they are able to make predictions about what effects interventions might have, which can themselves be tested. The authors’ use of DiD allows them to predict the past, but I’m not aware of them having made any predictions for the future. Perhaps the way to silence critics is to do so? Perhaps they’d like to tell us which areas of the country will have the biggest improvements in health outcomes in 2012/13? They might not want to do this because of all the confounding factors in respect of health reorganisation – but I’d contend the NHS has been in almost continuous reorganisation since at least 2006, so I do think that argument stands.
Finally, I’m genuinely interested in why the authors continue to refer to economic theory to justify their idea that moving from local monopolies (which is not really the case) to a greater number of providers (which may or may not constitute a competitive environment) is supported by economic theory. I was taught that oligopolies, which are largely the market structures in place in most health areas, have very little competition going on within them. Why would this generate improvements, especially when there is virtually no chance of a comprehensive public hospital being allowed to fail?
Bad science comes about when academics claim they have simple answers for policymakers, and don’t have strong evidence to support their claims. Exploring what happened post-2006 in the NHS requires a range of different methods and understandings. I’m very happy to say that Cooper has found some interesting outcomes – but to think he has explained what caused them is deeply premature. His piece in the FT was extraordinarily dismissive of anyone doing research outside of his own paradigm. That is no way to generate a better understanding of the complex problems we face in trying to improve the NHS.