From multi-stakeholder platforms like ORCID, to commercial services like Google Scholar, academic profiles exist in a complex landscape of information flows. Lambert Heller provides an overview of the available scholarly profile pages and offers insight into their future development, which is set to be shaped by business models, technology, and available data streams.
We’re used to easily finding researchers’ profile pages on the web, often finding more than one for the same researcher. Especially for the new generation of researchers, accustomed to having every piece of their research online, these individual profiles make all kinds of sense: They make it easy to show one’s own efforts, maintain and present a network of related researchers, and reach out to collaborators and potential employers.
These pages are relatives of “professional networks” such as LinkedIn – but they offer many information elements and information flows that are specific to those working in the academy and scientific research. However, these profile pages go far beyond the digital version of a traditional CV (“curriculum vitae”). A workshop within the Digital Humanities Experiments event on 11/12 June 2015 at the German Historical Institute Paris (DHIP), led by mathematician David Chavalarias and me, explored tomorrow’s networked researchers’ profile pages. (Learn more about the workshop and its outcomes in my blog series around the event.)
We are nowhere near a situation where one provider of scholarly profile pages makes all the others unnecessary. This is because we have a complex landscape of information flows with a number of totally different information hubs.
It’s about the data, stupid! – Why metadata availability largely defines the three major business models of scholarly profiles
On the one hand, we have contributor registries like ORCID, which is operated by a far-reaching multi-stakeholder coalition. They don’t even try to deliver one “full service package”, but they want to provide a sustainable, agreed-upon source of author data available to all kinds of services. Then we have commercial services. Bundling all kinds of services on their own platform, these services are most often islands in regard to their preferred information flow. You fill out your profile, maybe even upload your papers, but in most cases you will have a hard time letting any other service reuse the data once you have left it there. Third, we have institutional players, and some non-commercial players that are essentially financed by academic funding organizations. They often draw from publicly available information streams, like the aforementioned ORCID – but whether and how the resulting profile pages are fit for consumption on the public web depends heavily on the goals and means of the institution running the services.
Let’s have a closer look at some of the major players in each of these categories. Or have a look at the table, if you’re in a hurry…
[1] Often supplemented by institution
[2] Often supplemented by institution
[3] Via ORCID
What do networked researchers’ profile pages include? Or, one ORCID iD to rule them all.
When we talk of modern approaches to the issue of networked profiles, we have to mention ORCID. ORCID is a relatively new initiative driven by some of the largest non-profit and commercial academic publishers, national libraries, professional societies and major Open Access repositories. Their goal is to build a centralised registry of all “researchers and contributors” to academic products, allowing for unique identifiers that remove ambiguity regarding the identification of their contributions. As an example, take a look at the web representation of ORCID iD 0000-0001-5109-3700.
What does ORCID hope to achieve? First, all publishing and archiving outlets will sooner or later be able to identify all authors and contributors by their ID; second, institutions and individuals can populate their own profiles with the ORCID data collected about them, synchronising and updating between their ORCID profiles and any other profiles they may have elsewhere. But is there any need for other profiles if you can have everything in one place, i.e. your ORCID profile? Let’s explore this in greater detail …
Information elements:
- Scholarly products (journal articles and other publications)
- Self-assigned keywords
- Researchers’ alternative names (to ensure disambiguation)
- Identities in other systems, profiles on other services
- Attribution of multiple institutions (education, former employers, etc.)
- Attribution of grants/third party funding
Reuse factor (structured availability and reuse rights):
- High
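As a small illustration of why an ORCID iD works well as an unambiguous, machine-checkable identifier: the last character of every iD is a check character computed with the ISO 7064 MOD 11-2 algorithm. The following is a minimal sketch in Python; the function names are illustrative assumptions and not part of any ORCID software. It only checks that an iD is well formed, not that it is actually registered to anyone.

```python
import re

# Four groups of four characters; only the final character may be "X".
ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_well_formed_orcid(orcid: str) -> bool:
    """Return True if the string has the ORCID iD shape and a matching check character."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    compact = orcid.replace("-", "")
    return orcid_check_character(compact[:15]) == compact[15]


if __name__ == "__main__":
    # The iD mentioned in the text above.
    print(is_well_formed_orcid("0000-0001-5109-3700"))  # True
    # Same iD with a corrupted last character.
    print(is_well_formed_orcid("0000-0001-5109-3701"))  # False
```

A service importing profile data could run a check like this before accepting an iD typed in by hand, which catches most single-character typos without any network request.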
Kings of convenience: the rise of commercial siloed academic networks
While ORCID may be new to some readers, nearly everybody within or in the vicinity of the academic environment is now familiar with “Facebook for scientists” services such as ResearchGate. This type of service started gaining ground around 2008 – the leaders in the field being ResearchGate, academia.edu and Mendeley, with user counts allegedly in the millions. (For further analysis, cf. Nentwich and König 2014. Example of profile pages: ResearchGate, academia.edu, Mendeley.) One reason for their success must be the convenience they offer, enabling anyone to present their academic efforts in one place – a convenience that sometimes develops into rather aggressive urging of users to update their profile for better discoverability. A prime example of the strange outgrowth of this kind of service is the “ResearchGate score”, a self-proclaimed new measure for scholarly impact, an indicator based solely on activities occurring on this service’s website, possibly one of the purest offerings to scholarly vanity imaginable.
What all these Facebook-mimicking services have in common is that all of the information entered in the database of these services, from simple facts about a researcher’s work to whole papers that can be self-archived directly into these services, is owned solely by the commercial enterprises behind them. In this way, these services exemplify the “web 2.0” principle of being free (as in free beer), with the caveat that you cede control over your aggregated profile data. This is not only a matter of data-freedom principles. If you try to harvest large chunks of content from these databases for reuse elsewhere (as undertaken regularly by Google and other search engines), you soon learn that this is not permitted. Only Mendeley earns a special mention for being a kind of exception in this regard – it offers much of its data under reuse conditions.
Most common information elements:
- Scholarly products (journal articles and other publications)
- Self-assigned keywords
- Attribution of multiple institutions (education, former employers, etc.)
- Personal profile photo
- Social graph (type of follower relation, in some services co-authorship)
- Attention metadata from the platform itself (views, downloads, bookmarks, etc.)
Reuse factor (structured availability and reuse rights):
- Low to non-existent (most academic networks)
- High (Mendeley)
Authentic researcher profiles that are (almost) never meant for the public web: siloed institutional “current research information systems” (CRIS)
Although information systems such as ResearchGate tend to be very popular at present, and can by all means shed light on what scholars truly want ™, they have at least one enduring problem: they are never complete. However, if you define scholarship as being attached to a certain university or other research institution, you may find “current research information systems” (CRIS) to be a possible new contender for acting as a valuable source of information about researchers and their activities. And a complete one at that, at least with regard to the institution running the respective CRIS.
What are CRIS all about? Mainly acquired by large academic publishers in recent years, contenders such as Thomson Reuters Converis, Elsevier Pure and Symplectic Elements offer CRIS database products. Research institutions run CRIS to pool data about their staff and research facilities. From a research controlling perspective, this is useful for understanding and reinforcing an institution’s assets. Although most of these systems are, technically, online databases, only a few institutions view this as an opportunity to raise public awareness of their research activities. In many cases, databases are completely hidden from public view. In contrast to “Facebook for scientists” services of the ResearchGate kind, with CRIS we have no problems with completeness and re-usage rights, but with the public availability of the data in the first place. That said, there are a number of positive exceptions: as mentioned in an earlier blog post, VIVO aims to be a research information system based on the original means of the web (like semantic ontologies), while delivering information from some universities to the whole open web, usually including comprehensive re-usage rights.
Most common information elements:
- Scholarly products (journal articles and other publications)
- Detailed attribution of institutional roles and positions
- Self-assigned keywords
- Concepts from controlled vocabularies and/or automatically generated profiles
- Personal profile photo
- Social graph (co-authorship)
- Attribution of grants/third party funding
Reuse factor (structured availability and reuse rights):
- Low to non-existent (most CRIS implementations)
- High (VIVO, a number of other CRIS implementations)
Impact and other ways to tell a scholar’s story: other approaches to researcher profile pages
Another very well-known type of researcher profile pages is delivered by Google’s academic search engine “Scholar”. (According to preliminary results of a very interesting survey from Utrecht University library, GS profiles are even more popular than those on ResearchGate, let alone ORCID, academia.edu or institutional CRIS.) Google Scholar is more or less comparable with huge traditional science citation indexes such as Web of Science (or WoS for short, now owned by Thomson Reuters) and its rival, Elsevier Scopus. Google radicalised competition between these huge cross-disciplinary corpora of scientific article and citation metadata: while WoS and Scopus covered a limited set of peer-reviewed academic journals, placing them all in an online database licensed by university libraries, Google Scholar takes a full-text search engine approach, undeniably covering more documents and delivering search results, including citation counts, to end users for free.
In 2011, Google Scholar launched profiles, something that cannot be found in WoS or Scopus. The idea is not only to give searchers a comprehensive view of individual researchers, their articles and citation counts, but also to enable researchers to add to their profiles themselves. Unlike ResearchGate, the service does not aim to be a “full service package”. Instead of inviting researchers to self-archive their papers on the actual website, it covers self-archived versions from services such as ResearchGate as well as from traditional institutional Open Access repositories. The only data that Google Scholar may automatically give to third party services is the citation count of each document.
ImpactStory offers a service that is comparable in many ways to that of Google Scholar profiles. However, it follows a very different business model. (Example of an ImpactStory profile page.) While Google Scholar is a commercial service that searchers and profile owners can use free of charge, ImpactStory is a largely third party-funded non-profit organisation seeking to become sustainable through services paid for by profile owners. While Google Scholar draws its data from its own article and citation index, ImpactStory remains sleek by drawing from many different sources of citation data and impact metadata – from Facebook ‘likes’ to the number of forks on Github – or so-called “altmetrics”. The idea is to operate as a service for collecting and consolidating this data, and to present it on behalf of profile owners.
ImpactStory is by no means the only service that aspires to be the clearing point for this kind of data – compare, for example, Plum or Altmetric.com. In the growing landscape of citation and attention metadata, many publishers, repositories and institutional research information services have already decided not to collect impact metadata themselves, but to draw from one of these services. It is interesting to note that ImpactStory was one of the first services of its kind to offer the automated import of ORCID data. To conclude: although they appear to be similar at first glance, Google Scholar profiles are a strongly shielded island, whereas ImpactStory strives to be a useful intersection for different services and data streams.
Common information elements:
- Scholarly products (journal articles and other publications)
- Self-assigned keywords
- Personal profile photo
- Social graph of co-authorship (Google Scholar)
- Social graph (type of follower relation, in some services co-authorship)
- Citation data generated on the platform itself (Google Scholar)
- Citation and other impact data from different platforms (ImpactStory)
Reuse factor (structured availability and reuse rights):
- Low to non-existent (Google Scholar)
- High (ImpactStory)
Some conclusions
With the growing expectations of cultivating one’s own scholarship profile online completely and conveniently, things have become more interesting, and sometimes confusing. The whole area still seems to be in its infancy. A strong indicator of the ongoing development of this ecosystem is the consolidation of freely available metadata streams – besides ORCID, we now have CrossRef’s DOI event tracker pilot as a free source of impact metadata across many scholarly articles. In the area of institutional research information systems, open approaches such as VIVO ontologies and software are constantly gaining greater traction, enabling custom developments and experimentation. So, interesting times ahead!
Disclaimer: On behalf of my employer, TIB Hannover, I work with the DOI event tracker working group and the TIB Open Science Lab runs experiments and development with VIVO ontologies and software.
This topic was covered at a workshop within the Digital Humanities Experiments event on 11/12 June 2015 at the German Historical Institute Paris (DHIP), led by mathematician David Chavalarias and me. This is an edited extract of a piece which originally appeared here and is the second part (read part 1) of a contribution to DHIP’s blog carnival accompanying the whole event.
Note: This article gives the views of the authors, and not the position of the Impact of Social Science blog, nor of the London School of Economics.
Lambert Heller currently serves as the head of Open Science Lab at TIB Hannover (German national library of science and technology). As an academic librarian with a background in humanities and social sciences, he tries to find useful new things in the area of scholarly communication, and he writes and teaches sometimes about that. He tweets as @Lambo.
Both WoS and Scopus have profiles.
Really good overview of the main academic profiles services that exist today. I basically agree with everything that is said here.
However, I think there is an issue that deserves a more thorough discussion: the features each service offers to keep the profiles up to date. In my experience, academics often can’t be bothered to update their own profile regularly. This means that many profiles are effectively useless. For example: I’m sure there are thousands of empty or near empty ORCID profiles: a user who created an account and added a few documents to see how it worked, but he/she quickly got bored or tired, left, and never came back. Outdated profiles also make comparisons among various profiles very difficult. Therefore, I might have added another row to your table to describe how each of these products approaches this issue.
Of course, all services allow users to add documents manually (the problem is no one wants to do that). ORCID, for example, also allows users to import them from other services (Thomson Reuters’ ResearcherID, BibTeX files…). In ResearchGate, when an author uploads a new paper, that paper is also added to the rest of the coauthors’ profiles (if they have one). Impactstory automatically imports new elements added to figshare, github, ORCID… but it requires user interaction to import documents from Google Scholar (the restriction is imposed by Google, of course). My opinion is that even importing batches of records from other services is too much work, or too troublesome for the average academic. I have heard about some universities that have created teams who set up and maintain ORCID profiles for all their researchers, but I don’t think that’s something most universities and research centers will be able to do.
To the best of my knowledge, the only service that automatically adds new academic documents to profiles without any kind of user interaction is Google Scholar Citations. Since Google Scholar is constantly parsing the academic web in search of new documents, GSC will automatically add a document to a profile just a few days after it is published online in an academic journal, an academic repository, etc. The system also lets users decide whether they want GSC to automatically add documents to the profile, or to send them a confirmation e-mail first (so the user just has to say: “yes, I wrote this”). Of course, this approach has introduced another kind of problem: there are users who create a GSC account in full automatic mode, and then leave it unsupervised for a long time. In these cases it is common to find documents that weren’t actually written by the user who created the profile. This usually happens to users with common names.
So, to sum up, I believe one of the greatest challenges of academic profiles is how they are kept up to date. Academics want something that works, but that at the same time doesn’t require work from them. I think that is maybe the reason why GSC profiles are becoming so popular, even if they don’t offer as many fancy features as ORCID, ResearchGate or ImpactStory. On the other hand, I believe the aggressive notification policies that ResearchGate and academia.edu currently set by default are a mistake, because academics probably won’t stand them for long (again, changing the default notification settings is not something you can realistically expect most people to do).
I think this has already become too long a response. Thank you for your attention if you got to the end 🙂
I got there, Alberto, and heartily agree with your points about keeping profiles up to date! I used to try to encourage my academics to stick to just one product (either ResearchGate or academia.edu or Google Scholar etc). But that’s hard when fashions and advice from colleagues and institutions change. The question I frequently get asked is: which product is best? Which one should I invest my precious time in? I like the idea of libraries being involved in keeping profiles up to date – a natural area of expertise for us. But are we logistically ready for that? Yes, if we decide it is a priority and let go of other areas that are perhaps not relevant anymore.
Alberto and Sian, thank you both for your comments!
I think the point about getting librarians involved in keeping profiles up-to-date is indeed interesting. Current Research Information Systems (CRIS) normally work that way. As they are usually an institutional offering, most of them have a sophisticated workflow, where librarians check publication lists (in regard to consistency, completeness etc.), while researchers retain full control over what may finally be published to the outside world.
Recently ORCID introduced a very similar concept: You can delegate individuals or even organizations to keep your profile updated, which makes total sense – as a typical ORCID usage scenario includes using your ORCID record as a trusted source e.g. to keep your institutional profile up-to-date, and vice versa.
This leads me to your praise of Google Scholar et al. in regard to the level of automation. I think this is not entirely true. The special thing about ORCID is that publishers, services like PubMed Central, large repositories etc. can (and in the near future surely will) signal your attribution as an author or contributor exactly when your publication takes place! Services like ImpactStory, and very often CRISes and VIVO, will make use of the updated data in your profile. This is the way ORCID is supposed to work in the long run.
In comparison, services like Google Scholar or ResearchGate are islands. They don’t make use of ORCID (at least as far as we know today), but harvest publishers’ websites, PubMed etc. each on their own. This has the downside that matches are often late and not of the best quality possible. (Think name changes or similar names, which was a reason to build ORCID in the first place.) So, while Google Scholar and ResearchGate may look like (and possibly are) a pretty good choice at the moment, you have to be aware that 1. they won’t allow you to reuse your personal networks’ data (or any other huge chunk of data) in any other service, if you change your mind one day; 2. they keep you separated from the infrastructure of reusable information flows like ORCID, which are sometimes even mandated by funding organizations such as the Wellcome Trust.
My take as a librarian: I tend to advise young scholars to see Google Scholar and ResearchGate as additional “amplifiers” that they can use to spread the word about their research outcomes even wider – if they have the time to do so. But ORCID is a far better candidate not just as one of many “profile page” providers, but as an information hub that will directly link the most important publishing outlets, research information systems, library catalogs etc. to each other. As a researcher, I’d expect my research institution to provide me with a complete ORCID record – at least in the long run.