As citation counts, h-indexes, and impact become increasingly important to matters of funding and promotion, Melissa Terras asks why more scholars are not chasing up publishers to find out how their work is faring among the online audience, and makes some pleasing discoveries on how her own research has been received.
A month or so ago, I posted about whether blogging and tweeting about academic research papers was “worth it”. Whilst writing up my thoughts, the one thing that I found really problematic was the following:
“I also know nothing about how many times my other papers are downloaded from the websites of published journals, or consulted in print in the Library. The latter, no-one can really say about – but the former? It seems strange to me that we write articles (without being paid) and we get them published by people who make a profit on them, then we don’t even know – usually – how many downloads they are getting from the journals themselves.”
That’s true enough, I thought. But whose fault is it that I don’t know about access statistics for journals I have published in? Heck, have I ever asked for the access statistics for how many times my papers have been downloaded from the journals they are published in? Has anyone?
So, Reader, I asked for some facts and figures, regarding the circulation of journals, and the download statistics of my papers. I have to say that the journals were really very helpful, and forthcoming, if surprised:
“I imagine the publishers would be happy to tell an author the cumulative downloads for their papers… So far as I know, you are the first author ever to ask… certainly the first to ask me”, said David Bawden, Editor of the Journal of Documentation. Jonas Söderholm, Editor of HumanIT, highlighted some of the issues journals will face if people start asking this kind of question, saying:
“A reasonable request and we would gladly assist you. Unfortunately we do not have direct access to server logs as our web site is hosted as part of the larger University of Borås web. We will take your request as a good excuse to check into the matter though, and also review our general policy on log data.”
Most journals got back to me by return of email, telling me immediately what they knew. They were very aware of the limitations of their reporting mechanisms and explained the caveats in detail: for example, whether or not the figures excluded robot activity, and the fact that how long a user stays on the website is not known, so accidental click-throughs cannot be identified. Emerald, the publishers of JDoc and Aslib Proceedings, were not comfortable giving me access to wider statistics about their general readership numbers, given this could be commercially sensitive information, which is understandable; they were very happy to give me the statistics relating to my own papers, though.
The only journal not to get back to me was LLC, published by Oxford University Press (the editor replied to say he was not sure he had access to these statistics, but would ask). This is ironic, given I’m on the editorial board. I’ll press further, and take it to our summer steering-group meeting.
I suspect that the actual statistics involved are only really very interesting to myself. I had originally planned to make comparisons with the number of downloads from UCL Discovery (Open Access (OA) is better, folks!) but I think the picture is foggier than that. What this exercise does do is highlight the type of information that, as authors, we don’t normally hear about, which can be actually quite interesting for us, as well as stressing the complex relationship between OA and paywalled publications. Here are some details:
- One of my papers published in JDoc (Ross, C and Terras, M and Warwick, C and Welsh, A (2011) Enabled backchannel: conference Twitter use by digital humanists. J DOC, 67 (2) 214 – 237) was downloaded 804 times from the JDoc website during 2011, and was number 16 in the download popularity list that year. The total number of paper downloads from JDoc as a whole during that year was 123,228. Isn’t that interesting to know? I have a top 20 paper in a really good journal in my discipline! Who knew? It has now been downloaded 1114 times from their website. In comparison, there have been 531 total downloads of that paper from UCL Discovery in the past 6 months. But the time frame for comparison of downloads with the OA copy from Discovery isn’t the same, so comparing is problematic – and there are more downloads from the subscription journal than from our OA repository. Still, it shows a healthy number of downloads, so I’m happy with that.
- The Art Libraries Journal, which is only available in print, not online, was quick to tell me that the journal is distributed to 550 members: 200 going abroad to Libraries/Institutions, 150 sent to UK Personal members, and 200 going to UK Libraries/Institutions. My paper published there (Terras, M (2010) Should we just send a copy? Digitisation, Use and Usefulness. Art Libraries Journal, 35 (1)) has had 205 downloads in the last six months from UCL Discovery, so I perceive that as a really good additional advert for OA: the print circulation is fairly limited, but the OA copy is available to all who want it.
- My paper in the International Journal of Digital Curation – itself an OA journal – (Gooding, P and Terras, M (2008) Grand Theft Archive: a quantitative analysis of the current state of computer game preservation. The International Journal of Digital Curation, 3 (2)) was downloaded 903 times in 2009 out of the 53,261 times the full text of a paper was accessed. (The average was 476, with standard deviation 307.) In 2010 the paper accounted for 919 out of the 120,126 times the full text of a paper was accessed. (The average was 938, with standard deviation 1045.) That compares to only 85 downloads from the UCL repository, but hey, it’s freely available online anyway, without having to resort to an OA copy in an institutional repository. It might be worth drawing from this that copies of papers in institutional archives are only really used when the paper isn’t available anywhere else, but you would hope that would be obvious, no?
- The journal Internet Archaeology has an online page with their download statistics readily available (how I wish all journals would do this). The journal gets around 6200 page requests per day. But since article size varies widely, with some split into 100s of separate HTML pages, it is difficult to know how meaningful this is. I was sent a spreadsheet of the stats from my paper published there (Terras, M (1999) A Virtual Tomb for Kelvingrove: Virtual Reality, Archaeology and Education. Internet Archaeology (7)) which suggests that there have been 2083 downloads of the PDF version of the paper from behind the paywall since 2001 (but some may be missing due to the way the reporting mechanism is set up) with none in the past year (compared to 276 downloads of this from UCL Discovery in the past six months, so many more from our institutional repository comparing like on like periods). The HTML version of the table of contents has been consulted 16,282 times since 2001 (this is freely available to all comers) but there have been 67,525 views of all files in the directory since then – but since the paper is made up of hundreds of individual files, it’s difficult to ascertain readership. Judith Winters, the Editor of Internet Archaeology, notes “It is curious that when the journal went Open Access for about 2 weeks towards the end of last year, the counts did increase but not dramatically so” – so when a non-OA journal throws open its doors for a limited time (IA did this to mark open access week last year) it’s not like access figures go wild. That’s really interesting, in itself.
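For anyone who wants to put figures like these in context, the arithmetic is simple. What follows is a minimal sketch in Python, not anything the journals supplied: the function names `share_percent` and `z_score` are illustrative choices, and the inputs are just the numbers quoted in the list above. It works out what share of a journal's total downloads one paper accounts for, and how far a paper sits from the journal's average in units of standard deviation.

```python
def share_percent(paper_downloads, journal_downloads):
    """Return one paper's downloads as a percentage of the journal's total."""
    return 100.0 * paper_downloads / journal_downloads


def z_score(value, mean, std_dev):
    """Return how many standard deviations `value` lies from `mean`."""
    return (value - mean) / std_dev


# JDoc, 2011: 804 downloads of the paper out of 123,228 for the journal.
jdoc_share_2011 = share_percent(804, 123_228)

# IJDC, 2009: 903 accesses; journal average 476, standard deviation 307.
ijdc_z_2009 = z_score(903, 476, 307)

# IJDC, 2010: 919 accesses; journal average 938, standard deviation 1045.
ijdc_z_2010 = z_score(919, 938, 1045)

# Art Libraries Journal print run, summed from the three groups quoted above.
alj_print_circulation = 200 + 150 + 200

print(f"JDoc share of 2011 downloads: {jdoc_share_2011:.2f}%")
print(f"IJDC 2009 z-score: {ijdc_z_2009:.2f}")
print(f"IJDC 2010 z-score: {ijdc_z_2010:.2f}")
print(f"Art Libraries Journal print circulation: {alj_print_circulation}")
```

On these figures, the JDoc paper accounts for roughly 0.65% of the journal's 2011 downloads, and the IJDC paper sits about 1.4 standard deviations above the journal average in 2009 but almost exactly at the average in 2010. None of this corrects for the caveats the publishers raised (robot traffic, accidental click-throughs, differing time frames), so treat the results as rough context rather than a measure of readership.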
If you are still reading, then thanks. This stuff gets pretty turgid. But it’s been fascinating, for me, to see the (mostly positive) reactions publishers have to being approached about this – and surprising that more people have not actually asked publishers about these statistics. We are giving away our scholarship to publishers, in most cases: shouldn’t we get to know how it fares in the wide, wide world? As citation counts, and h-indexes, and “impact” become increasingly important to external funding councils and internal promotion procedures within universities, why would journal publishers not make this information available to authors? And why don’t they do it more routinely?
Will you need this type of information for the next grant proposal, or internal promotion, you chase? Why would you not be interested in how your research flies? But journal publishers will only start providing authors with this kind of information routinely if enough scholars start to ask about it, and it becomes part of the mechanics of publishing research – particularly when publishing research online.
So if you have published in a print journal which has an online presence, or in an online journal, drop them an email to ask politely how your downloads are going*. Do it. Do it now. Ask them. Ask them!
*Perhaps someone online can provide some input as to whether such a request comes under the rights of individuals in the Data Protection Act in the UK. If you are a named author on a journal article, do access statistics about that journal paper count as personal information?
This post is also published today on Melissa Terras’ personal blog. Read her verdict on the value of social media in publicising academic work here.
Note: This article gives the views of the author(s), and not the position of the Impact of Social Sciences blog, nor of the London School of Economics.
Excellent post, thanks! One interesting issue that comes up fairly often with regard to institutional repositories and open access is academics asking quite reasonable questions along the lines of “I am only interested in download stats from journal x, and making my paper OA will detract from those downloads, so why should I bother?”. I think the analysis above refutes this: people will tend to use the OA version of a paper only when they have no other option, and “real” download stats aren’t eroded by the availability of an OA version.
Of course the challenge then becomes aggregating these stats, and the JISC have been running a project on this, called PIRUS 2 (probably of interest only to hardcore stats nerds and library types!).
Thanks – an interesting article. You might like to know that some journals (eg. Trials, which is a Biomed Central journal) send out emails to authors saying how many times their article has been accessed from the Trials website – maybe other publishers will start doing that.
Interesting blog. But there’s still a problem – it’s much easier to download published research than to actually read it! I do seriously wonder whether the exploding volume of published research actually results in more papers being read, and so contributes to the research process… or not.
There is now a very powerful tool called Total Impact (http://total-impact.org) that you can use to quickly obtain aggregated data on your publications. TI aggregates citation counts as well as alternative metrics (e.g. social bookmarking, Wikipedia entries, blogs, download counts for PLoS articles, etc.). For example, I grabbed one of your papers (I hope 🙂) and popped it into TI to get the following result: http://total-impact.org/collection/YzAQRy
The value of Total Impact results depends on the datafeeds that each publisher makes available, so mileage will clearly vary for different disciplines. However, it is worth giving Total Impact a go since it is so easy to use and can turn up some interesting results.