There are many valid concerns about AI’s use of copyrighted material, writes Lance Eaton. But there is no merit in rejecting AI simply to re-legitimise the same inequities in the enforcement of copyright that led to piracy in the first place.
A recent analysis by Alex Reisner shows that major AI companies have sourced training data from platforms like LibGen (Library Genesis) that contain copyrighted material. The suggestion is that they have done this knowingly, perhaps reasoning that it is better to beg forgiveness later than to ask permission first. Reisner has also published a searchable database of copyrighted content on LibGen to help identify what has been used.
The anger that has since been directed toward AI companies is understandable. Many authors took to social media to decry the extent to which their work is being used to train AI tools. But the issue also raises some difficult questions about copyright. Chief among them is whether we should be concerned about using material in this way if it ultimately makes information more accessible.
People vs profits
Reisner argues that while “LibGen and other such pirated libraries make information more accessible”, AI companies go further as “their goal is to absorb the work into profitable technology products that compete with the originals”. In essence, it is acceptable for people to benefit from resources like LibGen, but not companies.
The problem with this argument is that both individuals and companies are essentially doing the same thing: benefitting from illegally acquired copyrighted material. In both cases, they have decided that illegal means are more affordable and accessible than legal ones. They are both “ingesting” content that can be used for future creative, experiential or monetary purposes.
It is easier to criticise faceless companies than the countless individuals who illegally download millions of titles each month. Yet in essence, they are both using the work of authors without offering compensation. Indeed, the recent revelations have similarities to the outrage last year about “academic fracking”, where publishers sell access to material to AI companies. They represent yet another example of an industry built on exploiting the work of authors.
While Reisner states that society is still grappling with “how to manage the flow of knowledge and creative work in a way that benefits society most”, the truth is that this has already been decided. The “flow” goes to those who can control, afford, or steal material and very rarely do society’s interests enter the equation.
Copyright as a broken system
Copyright law fails to resolve these issues in a sustainable way. This is unsurprising as copyright law privileges intellectual property conglomerates rather than individual creators. Rebecca Giblin and Cory Doctorow’s Chokepoint Capitalism demonstrates how most creative industries are owned by a handful of companies, creating oligarchic control that limits creative expression and access. This is a feature of copyright law, not a bug.
Copyright reflects the tension between what the creator owes the culture and what the culture owes the creator. Creative work is only made legible by a culture because the creator has borrowed from that culture. The aim of copyright is to balance the interests and contributions of the creator and culture by giving the creator a limited period to profit from their work. Previously, creators were also allowed to renew copyright before their work entered the public domain. Today, those works do not enter the public domain until 70 years after the death of the author unless otherwise stipulated.
My dissertation on academic piracy sheds some light on how copyright has negatively impacted research and scholars. I explored how scholars make sense of their use of academic pirate networks like Sci-Hub and LibGen in relation to their scholarly identity. Many scholars indicated these platforms were important and even necessary to their work, allowing them to stay sufficiently up to date and engaged in their discipline to produce knowledge.
This suggests that many scholars in the 21st century feel they need to resort to accessing illegal platforms because traditional methods are insufficient or unavailable. It underlines that far too much research is locked behind paywalls and that scholars at all levels find it difficult to access the resources they need to do their work effectively. Illegal though they are, resources like LibGen exist because our system for accessing knowledge is fundamentally broken. These resources exist in opposition to a power structure that has created and profited greatly from copyright, while not returning work to the public in a reasonable and timely manner.
Difficult questions
What AI companies are accused of is something that intellectual property rights holders do regularly – bypass copyright when it suits them while enforcing it on individuals. This is what happened last year when publishers began selling the work of authors for AI training. AI companies may end up paying a fine, but individual authors are unlikely to see anything, and if they do, it will be a pittance. This calculation was likely already made when these companies initially decided to pirate content.
Audrey Watters invokes morality when trying to make sense of these developments, asking: “How can anyone cultivate a moral relationship [in students] to creative and intellectual work – their own and others’ – if we’re building it atop (or rather, pushing a button to autogenerate from) a technology of deception and theft? An immoral technology.”
This is a question we should be asking. However, we should also be asking about the immorality of copyright as it stands today and its history of individual and cultural exploitation. This is where things get difficult in relation to AI. There is much thoughtful and critical concern about AI that is valid. But if we’re going to try to change things for the better, we can’t pretend it was working before. There is no merit in rejecting AI simply to re-legitimise the same inequities that led us to piracy in the first place.
The content generated on this blog is for information purposes only. This article gives the views and opinions of the authors and does not reflect the views and opinions of the Impact of Social Science blog (the blog), nor of the London School of Economics and Political Science.
Image Credit: AI image generator on Shutterstock.