Criticisms have been made of recent influential studies showing improved performance in hospitals operating in more competitive environments compared with hospitals that have a local monopoly on care. Zack Cooper, Steve Gibbons, Simon Jones and Alistair McGuire set the record straight. The claims by Pollock et al are based either on distortions of the original research, or on an apparent lack of understanding of modern economic analysis.
In a recent blog post, Allyson Pollock, Alison Macfarlane and Ian Greener misrepresented our work and built straw man arguments designed to undermine our findings. This might pass as political discourse, but is it academic debate?
Perhaps these objections arose because of the publicity that our work has received, rather than the detail of the research. But our goal as academics is to produce the most rigorous research possible, with the further goal of promoting evidence-based policies. When we produce evidence that can have an impact on policy, we present this work and make it available to policy-makers, press and the public. Researchers in publicly funded universities are expected to publicise their work in a timely manner to policy makers. How else should good policy be formed or research funding justified?
Pollock and her co-authors decried ‘the drip feed of pro-competition studies’ we have produced. In fact, there are just two studies. Our first study in the Economic Journal (EJ) looked at the impact of competition – by which we mean a move to less monopolistic local markets – on quality. This work illustrated that competition between NHS providers in a market with fixed prices led to better outcomes. Our findings were consistent with what economic theory would predict and they mirror precisely the academic literature from empirical research in the US. More than that, since our research came out, two subsequent studies by separate research teams (Gaynor et al and Bloom et al) have reached nearly identical conclusions about the positive impact that fixed price competition has had in the NHS.
Our second paper looked at the impact of this competition on patients’ length of stay (and was an expansion of an earlier paper). It showed that NHS providers in competitive environments shortened their pre-surgical and overall length of hospital stays (which we regard as evidence of improvements in efficiency). In contrast, the net effect of the introduction of private providers into the market was to increase the average length of stay in NHS hospitals and is potentially suggestive of cream-skimming. This latter finding is not overtly pro-competitive. These studies provide precisely the kind of evidence that policy makers look for, so that they can learn about what has worked and not worked in the past, in order to chart a sensible path forward. This is why they have had a significant impact. Of course, wide-reaching policy should not be set on the basis of one study. However, as a body of evidence grows, the case for policy action becomes more persuasive.
What about the counter-arguments? So far, critics have not articulated a theory as to how fixed price competition could undermine quality in the NHS. They have presented no evidence of their own that competition has harmed patient outcomes.
To be fair, Professor Greener has done work in this area. For example, in 2009, he published an article in a journal called Public Money and Management titled, “Patient Choice in the NHS: What is the Effect of Choice Policies on Patients and Relationships in Health Economies”. This ethnographic study presents insights into the attitudes of hospital managers and staff in one NHS trust. But is it ‘good science’ in contrast to our work which he and Pollock call ‘bad science’ and which they have criticized in the Lancet?
Greener’s piece drew on 60 semi-structured interviews with NHS staff at a single NHS hospital; no patients were interviewed during the course of the research. From these interviews, he concludes that: ‘The case presented suggests that patient choice policies fall short on all of the conditions that are necessary for them to work. Patients in the case study were reluctant to exercise choice decisions’. Here, Greener is happy to use a qualitative style of research (interviews at a single hospital) to draw conclusions of his own against a national policy.
However, when other researchers use qualitative research together with quantitative evidence to show that competition can have positive effects, there is less tolerance. In their co-authored Lancet comment piece attacking our research on competition, Pollock et al. dismissed work by Nick Bloom, Carol Propper, John Van Reenen and Stephen Seiler, stating disparagingly that in their study: ‘An association with management quality is based on interviews with 161 senior staff that did not take account of relevant causal factors’. Bloom et al. involved interviews at 100 hospitals and integrated quantitative work with advanced econometrics. If this is ‘bad science’, what are we to make of the critics’ own work, which adopts a related approach?
Henry Overman has kicked off a good conversation on this blog about what constitutes sensible blogging – we hope this discussion continues. Hopefully such a debate, plus our reply here, will provide a teachable moment to pause and reflect on how academics discuss evidence, consider the casual use of phrases like ‘bad science’ and begin a thoughtful discussion of the role of blogs in academic and policy debates in the social sciences.
In the Annex below we give a further point-by-point rebuttal to the criticisms of our work. It is worth noting that Professor Pollock has raised these points before and we have responded to them twice, both in a Lancet letter and in a freely accessible 8-page document (a detailed response that Pollock et al. do not mention) posted online, also included as a linked appendix to our Lancet reply.
Zack Cooper is a health economist at LSE Health at the London School of Economics. Zack’s work focuses on assessing the impact of competition in hospital and insurance markets and analyzing the effect of financial incentives and payment reforms on health care delivery.
Steve Gibbons is a Reader in Economic Geography, Department of Geography and Environment at the London School of Economics. He is also Research Director of the ESRC/BIS/WAG funded Spatial Economics Research Centre.
Simon Jones has a joint appointment as a Visiting Professor within the Department of Health Care and Policy at the University of Surrey and as Chief Statistician at Dr Foster Intelligence.
Alistair McGuire is Professor of Health Economics in the Department of Social Policy at the London School of Economics.
ANNEX: DETAILED RESPONSES TO POLLOCK ET AL’S CLAIMS
Claim 1: ‘The major improvements in outcomes after acute myocardial infarction can be attributed to improvements in primary prevention…’
In their third paragraph Pollock et al argue that observable reductions in AMI death rates were attributable to improvements in AMI care itself, including ‘primary prevention in general practice and in hospital care, including the introduction of percutaneous IV angiography’. The implication is that competition does not explain the widespread decrease in AMI mortality during this period.
Our response: Our research does not examine the causes of widespread reductions in AMI mortality during this period, and it does not claim that competition was the main factor behind falling AMI mortality. Instead, our study investigates whether AMI mortality fell more quickly in more competitive areas after choice was introduced in 2006 (set against the background of these larger and more general changes). We illustrate that there was no difference in the rate of mortality reduction in more competitive versus less competitive areas prior to 2006, before competition was introduced. We then illustrate that mortality fell more quickly in more competitive areas after patients in England had choice and hospitals in England had to compete from 2006 onwards.
We state explicitly in our EJ paper on page 243 that: ‘Hospital quality, measured by 30-day AMI in-hospital mortality, improved consistently from 2002 to 2008 [the period pre-dating widespread patient choice], as shown in Table 2. Likewise, the number of AMIs treated per year also fell. This reduction in mortality and reduction in overall AMI occurrences is consistent with international trends and is driven, in part, by increasing adoption of new technology in the treatment of AMI and improvements in public health…’.
Our main finding is that prior to 2006, there was no difference between the rate of mortality reduction in more or less competitive areas, but from 2006 onwards, after competition was introduced, more competitive areas reduced their AMI rates more quickly. Hence, we say: ‘Ultimately, we find that after the introduction of these reforms in 2006, our marker for service quality (AMI mortality) improved more quickly for patients living in more competitive hospital markets’ (page 229).
Claim 2: ‘Not all patients were aware that they could choose and that AMI was an emergency procedure’.
Here, Pollock et al make two claims related to the impact of patient choice on AMI outcomes. First, they claim we simply assume that choice occurred, and that since only 49 per cent of patients recalled being offered a choice, this was not enough to create substantially strong incentives for providers to improve. Secondly, they cite a quote by Roger Boyle saying: ‘It is bizarre to choose a condition where choice by consumer can have virtually no effect’.
Our response: A 2010 King’s Fund Report by Dixon et al ‘found that just under half (49 per cent) of patients recalled being offered a choice’. Yet this fact does little to suggest an absence of financial incentives for hospitals. Consistent with the wider industrial organization literature, we also believe that if 49% of a hospital’s patients were offered choice, this creates sufficiently large financial incentives for hospitals to improve quality. More than that, the threat of patient choice alone is also likely to create an incentive for providers to improve their performance, and by a wide majority NHS patients are in favor of having choice over where they receive care. Evidence from the British Social Attitudes Survey shows that when asked, 75% of the British public say they want the ability to select their hospital (Appleby and Phillips, 2009). On page 233 of our EJ paper, we outlined the mechanism through which competition should improve hospital quality, including highlighting the role that GPs (as opposed to patients themselves) played in these reforms.
On the critics’ second point, that AMI mortality may not be impacted by competition in the elective market, this is something we address in multiple places throughout the paper. On page 234, we write: ‘While providers are not explicitly competing for AMI patients because competition in the NHS is limited to the market for elective care, we expect the market-based reforms to result in across-the-board improvements in hospital performance, which in turn will result in lower AMI death rates. To that end, Bloom et al. (2010) looked at NHS hospitals and found that better managed hospitals had significantly lower AMI mortality and that greater hospital competition was associated with better hospital management’. In addition, in Appendix A, we pen a mathematical explanation of this for more technical readers.
Claim 3: ‘Crucially, even if patient choice had occurred it does not explain why heart attack mortality rates fell. There is no biological mechanism to explain why having a choice of providers for elective hip and knee operations surgery …could affect the overall outcomes from acute myocardial infarction’.
Here, the claim is that there is no biological mechanism through which giving patients a choice and increasing incentives for hospital quality could lower heart attack death rates.
Our response: It is surely clear from reading the paper that the theoretical mechanism we consider is that the extension of choice created financial incentives for hospitals (and their staff) to improve the quality of their services, and that improvements in one area of service will have an impact on quality in other areas. In short, competition improved care quality, and improved care quality lowered death rates.
Claim 4: ‘they seem unaware that lengths of stay differ between the conditions they examine’.
Here, Pollock et al note (rightly) that there are different lengths of stay between hip replacements, knee replacements, arthroscopies and hernia repairs. They argue (wrongly) that this was not controlled for in our most recent work and that this could be biasing our results.
Our response: As we note on page 12 of our paper, and on every table that includes regression results in our analysis, we include ‘fixed effects’ for procedures, which control for the different lengths of stay across procedures (i.e. our estimates are identified from within-procedure differences over time). Our results are robust with and without these controls, and are similar for each of the procedures individually. We choose to pool the data and to include fixed effects to allow for a neater presentation of our results.
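To illustrate what procedure fixed effects do (a purely hypothetical sketch with invented lengths of stay, not data from our paper), demeaning within each procedure strips out the level differences between procedures, so that only within-procedure variation over time is left to drive the comparison:

```python
# Hypothetical lengths of stay (days) for two procedures over three
# years. Each procedure has a different baseline level; a procedure
# fixed effect removes that baseline, which is equivalent to demeaning
# within each procedure.
stays = {
    "hip replacement": [5.0, 4.6, 4.2],   # invented values for 2006-08
    "hernia repair":   [1.4, 1.2, 1.0],
}

demeaned = {}
for procedure, values in stays.items():
    mean = sum(values) / len(values)
    # Subtract the procedure-specific mean: what remains is the
    # within-procedure variation over time.
    demeaned[procedure] = [round(v - mean, 2) for v in values]

print(demeaned)
```

Both series now vary around zero, so the fact that hip replacements take several days longer than hernia repairs no longer biases a pooled comparison over time.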
Claim 5: Correlation versus causation
Pollock et al argue that ‘the authors make the cardinal error of not understanding their data and of confusing minor statistical associations with causation’.
Our response: In line with much modern applied work in economics, our methods work to mimic the conditions of a controlled experiment, given that such explicit experimentation is infeasible in real world policy analysis. For anyone unfamiliar with these ideas, articles like Imbens and Wooldridge (2009) and Angrist and Pischke (2010) give a sense of how the methods we use are aimed at measuring causation.
These articles discuss empirical techniques used in statistical analysis that are designed to assess the causal relationships between policies and outcomes. Both discuss difference-in-difference regression, which is the technique used in our research papers. This statistical technique is designed to identify causation, and it is frequently used in policy analysis and program evaluation. Of course, no empirical analysis – other than, perhaps, an ideally designed large-scale randomized controlled trial – can ever unequivocally disentangle correlation from causality. But researchers have to work with what methods and data are available in order to try to say something useful about causality.
Difference-in-difference (DiD) regression is a real world application of the same statistical techniques and theory used in randomized controlled trials (RCTs). The key assumptions underpinning our DiD regression are that:
(i) Prior to the reform, there were no differences in trends in acute myocardial infarction (AMI) mortality or in length of stay between our treatment group of hospitals (operating in less concentrated markets) and the control group of hospitals (operating in a monopolistic market). And
(ii) The policy (in our case the introduction of patient choice into these markets) was not introduced at exactly the same time as other policies that would have also improved outcomes in the treatment group of hospitals relative to the control group.
These are assumptions we tested directly in the paper. We illustrate that there were no pre-reform differences in performance for hospitals in competitive versus non-competitive markets. Likewise, we measure market structure in 13 separate ways and show that the results are similar in each approach, despite the fact that our measures of market structure are not heavily correlated with each other. In addition, we show that our results were not driven purely by urbanization, by testing directly whether urban areas did better after 2006 (they did not). DiD regression has a long history in policy analysis. Indeed, it was first used in a seminal public health study by Snow (1855) looking at the causes of cholera in London.
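The core DiD logic can be sketched in a few lines of Python. This is a purely illustrative calculation with made-up mortality rates, not our actual estimates (which come from regressions with patient controls and hospital fixed effects):

```python
# Illustrative difference-in-differences calculation on hypothetical
# AMI mortality rates (percent). The numbers are invented, chosen only
# to show the mechanics of the estimator.

# Mean mortality before and after the 2006 choice reform
treated_pre, treated_post = 10.0, 8.2   # hospitals in competitive markets
control_pre, control_post = 10.4, 9.6   # hospitals in monopolistic markets

# Simple before/after differences conflate the reform with the general
# downward trend in AMI mortality affecting all hospitals.
treated_change = treated_post - treated_pre   # about -1.8
control_change = control_post - control_pre   # about -0.8

# The DiD estimate nets out that common trend: the extra improvement in
# competitive markets is what we attribute to the reform, provided the
# parallel pre-trends assumption holds.
did_estimate = treated_change - control_change
print(round(did_estimate, 1))  # prints -1.0
```

In this invented example, both groups improved, but the competitive-market hospitals improved by an extra percentage point; the subtraction of the control group's change is what distinguishes the reform's effect from the background trend.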
Claim 6: ‘Deaths from acute MI are not a measure of quality of hospital care, rather a measure of access to and quality of cardiology care’.
Here, the claim is that AMI mortality is not a measure of hospital quality and that it is a measure instead of access to the hospital and specifically of cardiology services.
Our response: Again, this ignores evidence presented in our EJ paper, where we outline our rationale for using AMI as a quality indicator on page 237, and on the next page show that hospitals with higher AMI death rates also have higher overall death rates, longer lengths of stay for elective hip and knee replacements, and higher waiting times. Bloom et al show that better managed hospitals have lower AMI mortality and higher patient satisfaction. This is why AMI mortality has been used as a measure of hospital quality in a number of studies, including Kessler and McClellan (2000), Volpp et al. (2003), Propper et al. (2004), Kessler and Geppert (2005) and Gaynor et al. (2010). These are not simply economics articles written by empiricists unversed in medicine: Mark McClellan and Kevin Volpp are both trained medical doctors (in addition to being trained economists).
Claim 7: ‘Equally, the authors do not look at how clinical coding changed following the introduction of the tariff in 2006’.
Here Pollock et al imply that changing coding practices could be producing the results that we observe and that this is something we ignore.
Our response: Our methodology allowed for multiple coding of the same procedure. In our paper, we have included at least three separate OPCS 4 codes for each procedure, specifically to allow for the possibility that Pollock et al raised, which is that the same procedure was coded differently over time. We bundled all procedures into one master heading – i.e. hip replacements – and we did not look at procedures defined using a single code.
Claim 8: ‘Finally, length of stay is also a product of a range of factors related to the conditions in their data; pre-operative work for hip and knee replacement needs to take account of rurality and patient fitness for discharge, especially if patients live alone and have other co-morbidities and complexities’.
The critics say here that it is imperative to control for the ‘rurality’ of patients and their underlying co-morbidities.
Our response: Our paper controls for these factors. We initially introduced controls for co-morbidities using the Charlson index of co-morbidity. However, when they did not alter our main estimates, we chose not to include them, since the counting of co-morbidities during this period was potentially subject to up-coding. We control for socio-economic status using the income vector of the 2004 Index of Multiple Deprivation and additionally control for patients’ gender and age. ‘Rurality’ is controlled for by including a measure of the distance from the patient’s registered GP to the hospital where they received care and by using hospital fixed effects. It is also controlled for in the way we design many of our market structure indices, which take into account actual GP referral patterns, travel times, and population density.
Claim 9: ‘they ignore the political context in which the data was generated’
The claim here seems to be that the Hospital Episode Statistics (HES) data is not accurate and that the ‘political context’ in which it was collected undermines its accuracy.
Our response: For any such ‘manipulation’ to affect our results, there would need to be systematic manipulation of the data such that hospitals located in competitive areas deliberately reduced their recorded length of stay by manipulating their admission and discharge dates. Furthermore, they would have needed to do so in such a way that it happened to coincide precisely with the introduction of choice and was correlated with the three measures of market structure we used in our efficiency paper and the 13 measures we used in our EJ paper.