Is the elusive nature of data just part of the normal challenges of research life? Or, should we look forward to a time when relevant data is easily accessible to the laziest internet browser? Peter John asks if data searching could be easier than finding that needle in the haystack.

In part I of this blog, I reported some of the key findings of the UK Policy Agendas Project, and gave a link to our website so that researchers and others can find the data. We hope the findings and data will be useful for years to come. In truth, however, the use of these data will depend on word of mouth, serendipitous web searches, and readers scanning the footnotes of our academic papers. This is typical of data sources in British politics, which appear for particular reasons and from time-limited funded projects, and are then left on websites or in the UK Data Archive or even on hard drives. These resources are hard to find without some prior knowledge of the work. 

So if you are interested in knowing which ministers resign, you need to know in advance about the Ministerial Resignations Project, coordinated by Keith Dowding (ANU) and Patrick Dumont (Luxembourg). If you type ‘ministerial resignations’ into a search engine, the first result is an actual resignation. Next down in the search results you will see a list of resignations kept by the UK parliament, which might tempt you to start coding right away and waste time and energy. (Hands up all researchers who have coded data and then found it has already been collected by someone else? My hand is well and truly up!) It is only at item six that you find the project’s codebook on ministerial resignations. If you do not know anything about the topic you might breeze by this. There is an Oxford handbook at entry seven, but again, if you are not primed you might skip past.

I can repeat this exercise with a number of examples. If you want to know UK budgets by function, you need to know that Chris Wlezien and Stuart Soroka had a Nuffield Foundation funded project on this topic and deposited the data in the UK Data Archive. Just typing ‘UK budgets’ or ‘UK expenditure’ into the archive’s search facility won’t help either, as it pulls up the family budget surveys. The only easy way to locate the Wlezien/Soroka dataset in the archive is to type in an investigator’s name, which again means you must have prior knowledge to find the data. In fact, you are much better off using the website ‘Degrees of Democracy’, where the data are there on the screen ready to download.

The field of British election studies is much better served, partly because of long-running surveys such as the British Election Study, and also owing to the altruism of those who provide data in an easy to download format, especially Pippa Norris. But you still need to know in advance what is in those websites. For example, area-level local election results are not on the highly useful website run by Michael Thrasher and Colin Rallings; they sit in an Access dataset in the data archive. Once you get this dataset, it is easy to use, but you need to know in advance about it.

It might be the case that the elusive nature of data sources in British politics is part of the normal challenges of research life. I am also going to admit to being clunky, myopic and grumpy when searching on the internet. Also, anyone can contact academics to ask about data and usually get helpful responses. On the other hand, I do think it could be easier to find data, especially in an age where we expect all information to be ready in one or two clicks. Like the proponents of Nudge, we should expect citizens to be just as lazy as I am, and the most incompetent internet searcher and website user should be able to get to where she or he wants to go without too much bother. We need all the data in one place, on a large website to rule all the others, or at least a tool that can help you find the data.

You are probably thinking that it is all very well for Peter John to take sideswipes at the provision of data. What exactly is he doing about it? Moreover, what can be done about it? Has not what he is saying been articulated much more eloquently before (see Patrick Dunleavy (2010), ‘New worlds in political science’, Political Studies, 58 (1), pp. 239-265)? Political science in the UK does not have much unity in its approach, unlike studies of the US Congress or economics, where it makes much more sense for entrepreneurs to provide data. Something like this did start with the innovative Keele Political Resources, but these kinds of websites need a lot of maintenance and can get diverted into providing a large number of website links, making it even harder for the researcher who hunts for data.

So my questions are: a) am I right to characterize the typical search for data in British politics as the proverbial search for a needle in a haystack; b) could searching be a better experience, helped by a website that contains all data in easy to download Excel files or a tool that links the websites; and c) who will provide such a website/tool and maintain it long-term? I have a horrible feeling that the finger will get pointed at me, and then, like any other academic, I am worrying about the two books I am co-writing this summer! I guess this is another example of the collective action problem.

