Crowdsourcing offers researchers ready access to large numbers of participants, while enabling the processing of huge, unique datasets. However, the power of crowdsourcing raises several issues, including whether or not what initially emerged as a business practice can be transformed into a sound research method. Isabell Stamm and Lina Eklund argue that the complexities of managing large numbers of people mean crowdsourcing reduces participants to one faceless crowd. Applied to research, this is inherently problematic as it contradicts the basic idea that we control who participates in our studies. This not only challenges scientific rules of representativeness but also leaves methodological designs vulnerable to researchers’ implicit assumptions about the crowd.
This post is part of our digital methodologies series, arising from a stream at the ISA Logic and Methodologies conference, September 2016, organised by Mark Carrigan, Christian Bokhove, Sarah Lewthwaite and Richard Wiggins.
Social scientists are expanding the landscape of academic knowledge production by adopting online crowdsourcing techniques used by businesses to design, innovate, and produce. Researchers employ crowdsourcing for a number of tasks, such as taking pictures, writing text, recording stories, or digesting web-based data (tweets, posts, links, etc.). In an increasingly competitive academic climate, crowdsourcing offers researchers a cutting-edge tool for engaging with the public. Yet this socio-technical practice emerged as a business procedure rather than a research method and thus contains many hidden assumptions about the world which concretely affect the knowledge produced. With this comes a problematic reduction of research participants to a single, faceless crowd, which calls for a critical assessment of crowdsourcing’s methodological assumptions.
Image credit: The crossing by Mark Gunn. This work is licensed under a CC BY 2.0 license.
In essence, crowdsourcing harnesses the time, energy, and talent of individuals – hereafter referred to as “crowd-taskers”. Crowdsourcing allows the involvement of a large number of participants and the processing of huge, unique datasets. As such, crowdsourcing is hyped as a key method of compiling and handling “big data”, one that can be applied to interpretation, coding, and evaluation procedures. Using crowdsourcing, researchers have written and illustrated books, coded masses of text data, and even created survey questions. In short, crowdsourcing opens up exciting new possibilities in knowledge production beyond the scope and scale of traditional research projects, and can be useful in all stages of the research process, from design to write-up.
However, the power of crowdsourcing raises several issues. Some, such as the working conditions of crowd-taskers, are already being discussed. But others have received less attention, including how to transform this business practice into a sound research method, and in particular how crowdsourcing changes the way we interact with research participants and what this means for the knowledge produced.
According to the alluring sales pitches of crowdsourcing platform providers such as Amazon Mechanical Turk, CrowdFlower or Zooniverse, researchers can engage with a seemingly unlimited workforce of knowledgeable, creative, globally-dispersed crowd-taskers. Crowdsourcing has been presented as an almost magical process: all the researcher has to do is input a task and – voilà! – enriched data comes back. Amazon Mechanical Turk promises you will “start receiving results in minutes”, while Workhub offered ways to “use the internet to get a year’s work done in a day”.
Crowdsourcing rhetoric draws heavily on tropes of efficiency, cost reduction, and the potential of technology to support vast networks of creativity. To describe the crowdsourcing process, CrowdFlower’s website uses the image of a launching rocket that returns with results; invoking speed, technological innovation, and a journey into the unknown. This rocket metaphor not only captures the efficiency trope that infuses crowdsourcing, but suggests a desirable opacity between researcher and contributors; they live on separate planets, too distant to have a clear picture of one another.
Indeed, the great appeal of crowdsourcing is its ability to draw on large numbers of individuals. Because of the complexities of managing huge numbers of persons, crowdsourcing reduces them to one faceless crowd. Instead of having to deal with each individual member, a researcher’s interaction is with the crowd itself; this is the essence of what crowdsourcing allows. This complexity-reducing mechanism is seen as the great benefit of crowdsourcing for business, yet becomes inherently problematic when applied to research, as it contradicts the basic idea that we control who participates in our studies, either as part of our sample or as part of our team.
Researchers have tried to examine who the crowd-taskers are. Studies show the crowd consists neither entirely of amateurs nor entirely of digital experts, but is more homogeneous and better educated than often pictured. Researchers further critique the often precarious working conditions of crowd-taskers. While these insights are valuable, in-depth knowledge about typical crowd-taskers does not resolve the issue of the faceless crowd in working with crowdsourcing as a method.
The “open call” nature of crowdsourcing means control over who participates is mostly beyond the researcher’s reach. The large number of participants precludes knowledge of each unique, situated individual, while at the same time the unknowable composition of the crowd – and even its potential homogeneity – challenges scientific rules of representativeness, thereby ruling out or greatly restricting the applicability of crowdsourcing for certain research questions.
With other kinds of research methods – just think about surveys vs. qualitative interviews – we have well-established ideas about the capabilities and composition of our research participants. In qualitative methods, we draw on constructivist ideas about the uniqueness and situatedness of each individual, whose experiences and views of those experiences are in focus. Quantitative methods, by contrast, build on positivist thought, with its focus on objective knowledge and random sampling and its premise that society, like nature, follows absolute laws.
Crowdsourcing, however, is not tied to one set of assumptions about the crowd-taskers. Instead, the researcher’s implicit assumptions about the crowd drive the methodological design. This image steers the definition of the task, selection of a platform, incentives offered, and so on. The underlying and implied images a researcher has of the crowd are thus impactful and shape the quality and validity of knowledge produced.
Thus the great advantage of crowdsourcing may be unproblematic for business, but raises methodological and ethical questions for academia.
When using crowdsourcing, it is the researcher’s responsibility to reflect upon the image of the crowd in order to achieve alignment between methodological assumptions, the research question, and the design of the crowdsourcing process. In current discussions on methodology, pragmatism has been on the rise as a frame of evaluation for what constitutes “good” research and how to think about research participants. In pragmatism, the focus is on reaching the most suitable procedure to answer a research question by constantly questioning, criticising, and improving what one is doing and why, in order to reach the most appropriate (note: not the most true) knowledge on which to act. This is similar to what has been called “reflexivity”.
As researchers and academic knowledge producers, we should not forget the parameters of knowledge production. We need to think about and reflect on the methodological underpinnings of new digital methods. To begin, we should reflect on who and what the “crowd” is and what this means for our particular study. To do so, we can draw on a pragmatist methodology that requires us to be candid about what we do and why, in relation to our end goal. We should remember that crowdsourcing stems from business and the structure of many commonly used platforms will shape our data.
With crowdsourcing come great opportunities, but also great responsibility.
This article gives the views of the authors, and not the position of the LSE Impact Blog, nor of the London School of Economics.
About the authors
Isabell Stamm is heading a research group on entrepreneurial group dynamics funded by Volkswagen Foundation and located at the Department for Sociology at the Technical University in Berlin. In addition, her research interests include entrepreneurial families, work-family-intertwinement, and cooperative research methods.
Lina Eklund is a researcher at the Department of Sociology, Stockholm University, Sweden. She is the principal investigator of the Stockholm Internet Research Group, SIRG. In addition, her research interests include social life, interactions, and gender in relation to digital technologies.
Crowd funding has fewer ethical issues, in my opinion, than well-funded and organised ‘think-tanks’, which have set agendas that are more and more focused on promoting and justifying neo-liberal ‘values’ and which the media do not scrutinize enough.
Thanks for using the image; I appreciate the credit.
With my co-author, I published a paper titled “CDS Rate Construction Methods and Machine”
https://papers.ssrn.com/soL3/papers.cfm?abstract_id=2967184 in May 2017; I recently found that Mendeley lists a paper with exactly the same title as mine, showing it as published in a journal as early as 2015. Apparently, the journal has no such publication.
I contacted Mendeley to correct the apparent mistake. Saying that their catalogue is crowdsourced and that it is the users’ responsibility to correct their data, they refused to correct it. I contacted two respected authors, both academics; neither of them has used Mendeley before.
“You are correct because Mendeley Catalogue is a crowd source. We are simply not allowed to change the data of that article because this was published by someone else. The only way is to let that user remove the original paper or change it so it will be updated in the web catalogue.”