Currently, there is little exchange between the different communities interested in the domain of bibliometrics. A recent conference aimed to bridge this gap. Peter Kraker, Katrin Weller, Isabella Peters and Elisabeth Lex report on the multitude of topics and viewpoints covered on the quantitative analysis of scientific research. A key theme was the strong need for more openness and transparency: transparency in research evaluation processes to avoid biases, transparency of algorithms that compute new scores and openness of useful technology.
We are currently witnessing a transition in scholarly communication from an offline, paper-based mode to a digital and online mode. New and complementary means of communication have arisen in the form of social networks and other online collaborative environments. New forms of communicating also pose new challenges to the field of bibliometrics, the science of studying and measuring scholarly communication and scholarly reputation. Key concepts of bibliometrics (such as citation indexes, co-citation analysis, maps of science etc.) are being transferred to the new Web environments. This development also led to the formation of the term altmetrics to describe the study of alternative indicators for scholarly communication, such as social media mentions (e.g. tweets) referring to scholarly articles or bookmarks in academic social bookmarking systems.
Given these developments, it is hardly surprising that we see an increased interest in bibliometrics from computer scientists and related disciplines. In addition, there has been a longstanding tradition of bibliometric research in several fields, with the medical and physics communities probably being the most active. Yet the basic theories and models relevant for bibliometrics are developed in science and technology studies (STS).
At the same time, metrics are becoming ubiquitous in research assessment, as evidenced in the recent HEFCE report “The Metric Tide”. Researchers have called for sensible use of metrics in research evaluation, e.g. in the San Francisco Declaration on Research Assessment (DORA), and there are endeavours promoting researcher portfolios, in which the researchers themselves choose which of their research output is important and for what reason (e.g. ACUMEN). Even so, metrics are more widespread than ever, even in public discourse. Drivers of this development are, amongst others, policy makers who increasingly ask for evidence on the impact of research.
In the wake of these developments, the bibliometrics community has become more committed to informing stakeholders on practical implications of its research, exemplified by the recently published Leiden Manifesto for research metrics. With bibliometrics research being more open and outward and increasingly the center of attention of many disciplines, it seemed more than apt to bring together different communities that do bibliometrics research to find out what they can learn from each other.
This was the goal of the workshop “Quantifying and Analysing Scholarly Communication on the Web (ASCW’15)”, which was made clear by its tagline: “What can bibliometrics do for you? What can you do for bibliometrics?”. When studying Web-based scholarly communication, bibliometricians increasingly need skills from computer science, such as collecting and processing large amounts of heterogeneous data. On the other hand, computer scientists may not be aware of the issues of quantitative analyses of scholarly communication and would benefit from the knowledge of information scientists. Currently, there is little exchange between the different communities interested in the domain of bibliometrics, which is exemplified by countless parallel research efforts when it comes to supporting and understanding scholarly communication on the Web. This effect is reinforced by the diverse publication outlets the different communities serve.
The workshop aimed at bridging the gap and was therefore held at this year’s Web Science Conference in Oxford. The conference has a distinct interdisciplinary approach to studying phenomena on the Web, and has also served as an incubator for the altmetrics community in the past. We organized the workshop with two principles in mind: open science and a focus on discourse. We first called for the submission of position papers, which were published on our website after an initial editorial check. Then, we invited our program committee and the whole research community to an open peer review. In a parallel effort, we invited response papers to all position papers. All papers and responses are available under a Creative Commons license from our website.
Which issues were discussed?
We received a number of highly relevant contributions from a variety of disciplines (psychology, computer science, political science, educational technology, bibliometrics) that led up to an equally diverse workshop.
Under the theme “What can bibliometrics do for you?”, Alan Dix (Talis & University of Birmingham) kicked off with a citation analysis of the results of the 2014 edition of the UK’s Research Excellence Framework (REF2014), which is solely based on expert assessment. In his analysis of the discipline of computer science, Alan found that there was a latent bias against applied research fields such as web science. In the expert assessment, these fields had received far fewer 4* ratings (indicating world-class research) than citation analysis would have predicted. Conversely, more theoretical areas such as logic fared much better in the expert assessment than the citation analysis would have predicted. In his response, Robert Jäschke (L3S) discussed some of the possible explanations for this latent bias, such as a halo effect when assessing papers from institutions that are perceived to be excellent. He suggested extending this research to other disciplines in the REF, and involving altmetrics in the evaluation.
Next up, Peter Kraker (Know-Center) presented a critical assessment of the ResearchGate Score, which was co-authored by Elisabeth Lex (Graz University of Technology). In the evaluation, they found that the ResearchGate Score has serious shortcomings: the score is opaque, and the algorithm changes over time with no indication as to what has changed. Furthermore, it incorporates the much-criticized journal impact factor to evaluate individual researchers. They concluded that the ResearchGate Score should not be considered in the evaluation of academics in its current form. In her response, Katy Jordan showed that even though the ResearchGate Score is opaque, it is possible to reverse-engineer the score to a large extent using multiple regression. She presented a model that explains over 95% of the score.
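To make the idea of reverse-engineering a score with multiple regression more concrete, here is a minimal sketch in Python. It is not Katy Jordan's model: the feature names (`impact_points`, `answers`, `followers`), the hidden weights and the data are all invented for illustration, and the data is synthetic. The sketch only shows the general technique of fitting an ordinary least squares model to observed profile features and checking how much of the score it explains.

```python
import random


def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(features, target):
    """Return [intercept, coef_1, ...] minimising the squared error."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def r_squared(features, target, coefs):
    """Share of the variance in the target explained by the fitted model."""
    preds = [coefs[0] + sum(c * v for c, v in zip(coefs[1:], f)) for f in features]
    mean_y = sum(target) / len(target)
    ss_res = sum((y - p) ** 2 for y, p in zip(target, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in target)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in for an opaque score. These weights are hypothetical and
# are NOT the real ResearchGate weights.
HIDDEN_WEIGHTS = (1.0, 0.6, 0.05, 0.2)  # intercept, impact_points, answers, followers
rng = random.Random(0)
profiles = [
    (rng.uniform(0, 50), rng.uniform(0, 100), rng.uniform(0, 20))
    for _ in range(200)
]
scores = [
    HIDDEN_WEIGHTS[0]
    + sum(w * v for w, v in zip(HIDDEN_WEIGHTS[1:], p))
    + rng.gauss(0, 1.0)  # unexplained part of the score
    for p in profiles
]

coefs = fit_ols(profiles, scores)
r2 = r_squared(profiles, scores, coefs)
print("estimated weights:", [round(c, 3) for c in coefs])
print("R^2:", round(r2, 3))
```

On real profile data the features would have to be collected from public profiles, and a high R² would only indicate that the chosen features account for most of the variation in the score, not that the actual algorithm has been recovered.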
After these talks, we switched to the theme “What can you do for bibliometrics?” to see how current approaches from computer science are finding their way into bibliometric research. Ryan Whalen (Northwestern University) presented a paper co-written with his Northwestern University colleagues Yun Huang, Anup Sawant, Brian Uzzi, and Noshir Contractor. They advocate the use of textual data and natural language processing methods to develop new bibliometric methods. To illustrate their point, they show how accounting for topical diversity of citations can improve impact predictions: citations from both topically distant and proximate papers provide more insight into an article’s impact potential than those from papers with middling similarity. In his response, Brett Buttliere (Knowledge Media Research Center) suggested a number of ways in which this method could be improved and expanded. For example, he suggested using social media data and the keywords therein following Whalen’s method. He also discussed the potential benefits of using more theory from fields such as psychology in bibliometric research, and how to nuance our idea of impact beyond a single dimension to create better metrics. He expanded on these thoughts in a recent LSE Impact Blog post.
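As a rough illustration of what accounting for the topical diversity of citations could look like, the following Python sketch sorts citing papers into topically distant, middling and proximate groups relative to the cited paper. It is only a toy: the bag-of-words cosine similarity is a simple stand-in for the natural language processing methods used by Whalen and colleagues, and the thresholds `low` and `high` as well as the example texts are assumptions made for this example.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Very simple bag-of-words representation of a text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def similarity_band(sim, low=0.2, high=0.6):
    """Assign a similarity value to a band; the thresholds are hypothetical."""
    if sim < low:
        return "distant"
    if sim > high:
        return "proximate"
    return "middling"


def citation_profile(cited_text, citing_texts):
    """Count how many citing texts fall into each similarity band."""
    cited = term_vector(cited_text)
    return Counter(
        similarity_band(cosine_similarity(cited, term_vector(text)))
        for text in citing_texts
    )


# Invented example texts standing in for abstracts.
cited = "citation networks predict impact"
citing = [
    "citation networks predict scholarly impact",
    "citation networks in biology research today",
    "protein folding dynamics in yeast",
]
profile = citation_profile(cited, citing)
print(profile)
```

The resulting counts per band could then serve as input features for an impact prediction model, which is where the finding about distant and proximate citations would come into play.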
Here is the overview of all workshop papers with their responses:
- Dix, A.: Citations and Sub-Area Bias in the UK Research Assessment Process
- Kraker, P. & Lex, E.: A Critical Look at the ResearchGate Score as a Measure of Scientific Reputation
- Whalen, R., Huang, Y., Sawant, A., Uzzi, B. & Contractor, N.: Natural Language Processing, Article Content & Bibliometrics: Predicting High Impact Science
General themes emerging from the discussion and conclusions
The three presentations helped to reflect upon three core players in the current discussion of bibliometrics and altmetrics: policy makers, platform providers, and researchers – both as the subjects of bibliometric analyses and as the conductors of bibliometric research. ASCW’15 mainly assembled the third group, whereas extending the dialogue to include policy makers and platform providers has been approached in other venues. But this event showed that once you put metrics researchers from different disciplines in the same room, good things happen. Despite – or more likely because of – the multitude of topics and different viewpoints, the exchange on how to further the field of quantitative analysis of science proved to be very fruitful.
Bibliometric indicators and computational approaches were critically discussed at a fine-grained level, including topics such as hidden biases, intentional gaming of indicators, and lack of reflection on different motivations for citation behavior. In the end, this line of discussion more than once concluded with the demand for more openness and transparency: transparency in research evaluation processes to avoid biases, transparency of algorithms that compute new scores behind the closed doors of commercial platform providers, and openness of useful technology (e.g. from Semantic Web research, including named entity recognition or sentiment analysis) to be reused in new contexts.
Another recurring theme was the idea of involving broader (user) communities to ensure quality standards, e.g. to have a community that alerts platform providers such as ResearchGate or Mendeley to algorithmic mismatches and wrongly extracted metadata. Another idea was to enrich research assessment with the expertise of the crowd.
One of the main conclusions was that metrics will gain in importance and that it is our responsibility as metrics researchers to take part in the ongoing discussion, and to remind other stakeholders of the lack of transparency and its potential consequences. Knowledge of parallel research efforts and approaches stemming from other disciplines is also critical, since orchestrated endeavours would be more efficient and effective given the diversity of fields and stakeholders that metrics are applied to. Another main conclusion was that more disciplinary and interdisciplinary research is needed to provide decision makers, funding agencies and politicians with the necessary insights to make informed decisions. Therefore, we want to continue the interdisciplinary exchange in further events and publication outlets. Also, watch out for further posts from the workshop – we hope to talk to you soon!
Note: This article gives the views of the authors, and not the position of the LSE Impact blog, nor of the London School of Economics.
About the Authors
Peter Kraker is a postdoctoral researcher at Know-Center of Graz University of Technology and a 2013/14 Panton Fellow. His main research interests are visualizations based on scholarly communication on the web, open science, and altmetrics. Peter is an open science advocate collaborating with the Open Knowledge Foundation and the Open Access Network Austria.
Katrin Weller is a postdoctoral researcher at GESIS Leibniz Institute for the Social Sciences, department of Computational Social Science. She studies social media users and uses in different contexts, including the use of social media for scholarly communication. Katrin was one of the inaugural Digital Studies Fellows at the Library of Congress’ John W. Kluge Center.
Isabella Peters is Professor of Web Science at ZBW Leibniz Information Centre for Economics and Kiel University. In her research she focuses on scholarly communication in social media environments as well as the evaluation of research outputs by incorporating user-generated content.
Elisabeth Lex is assistant professor at Graz University of Technology and she heads the Social Computing research area at Know-Center GmbH. In her research, she explores how digital traces humans leave behind on the Web can be exploited to model and shape the way people work, learn and interact. At Graz University of Technology, Elisabeth teaches Web Science as well as Science 2.0.
This is part of a series of pieces from the Quantifying and Analysing Scholarly Communication on the Web workshop. More from this series:
The ResearchGate Score: a good example of a bad metric
According to ResearchGate, the academic social networking site, their RG Score is “a new way to measure your scientific reputation”. With such high aims, Peter Kraker, Katy Jordan and Elisabeth Lex take a closer look at the opaque metric. By reverse engineering the score, they find that a significant weight is linked to ‘impact points’ – a similar metric to the widely discredited journal impact factor. Transparency in metrics is the only way scholarly measures can be put into context and the only way biases – which are inherent in all socially created metrics – can be uncovered.
We need informative metrics that will help, not hurt, the scientific endeavor – let’s work to make metrics better.
Rather than expecting people to stop utilizing metrics altogether, we would be better off focusing on making sure the metrics are effective and accurate, argues Brett Buttliere. By looking across a variety of indicators, supporting a centralised, interoperable metrics hub, and utilizing more theory in building metrics, scientists can better understand the diverse facets of research impact and research quality.
Context is everything: Making the case for more nuanced citation impact measures.
Access to more and more publication and citation data offers the potential for more powerful impact measures than traditional bibliometrics. Accounting for more of the context in the relationship between the citing and cited publications could provide more subtle and nuanced impact measurement. Ryan Whalen looks at the different ways that scientific content is related, and how these relationships could be explored further to improve measures of scientific impact.