Reflecting on a recent interview with Sam Altman, the CEO of OpenAI, the company behind ChatGPT, Mohammad Hosseini, Lex Bouter and Kristi Holmes argue against a rapid and optimistic embrace of new technology in favour of a measured and evidence-based approach.
The rapid rise of ChatGPT deserves special credit for having mainstreamed large language models (LLMs). It took five days for ChatGPT to reach one million users. In comparison, it took Instagram 2.5 months, Facebook 10 months, and Twitter 24 months to hit that number. The widespread use of ChatGPT initially caused panic in the education sector, but many educators and K12 teachers came up with ideas about how to integrate it into teaching. Sam Altman (OpenAI’s CEO) recently acknowledged these efforts in an interview:
“We’re in a new world now and generated text is something we all need to adapt to, and that’s fine. We adapted to calculators and changed what we tested in math classes. This is a more extreme version, but the benefits are more extreme as well. We hear from teachers who are understandably very nervous about the impact of this on homework. We also hear from teachers who are like wow, this is an unbelievable personal tutor for each kid. And I think I have used it to learn things myself and found it much more compelling than other ways I’ve learned things in the past. Like I would much rather have ChatGPT teach me about something than go read a textbook.”
There is a lot to unpack in these views. In the rest of this commentary, we challenge Altman’s assertions and argue for using ChatGPT and similar applications in educational settings cautiously. While we do not advocate for banning LLMs, given the speed at which these applications have been introduced and employed, we think the education sector has not had enough time to reflect on their impact and biases and devise risk-mitigating measures.
Issues related to the use of ChatGPT to support learning are complicated. While some may prefer ChatGPT over textbooks due to excitement about the technology or perhaps a sense of empowerment, this might not be so for others, especially those living in geographic regions with less access to technology or the Internet, those challenged by disabilities or limited digital literacy, or those whose learning style requires direct engagement for learning to be effective. Many of these challenges were observed in real time during the COVID-19 pandemic, and continue to resonate across different regions of the world. Accordingly, ChatGPT or similar applications do not necessarily offer a more compelling experience for everyone. Furthermore, because ChatGPT is a new phenomenon, we have neither empirical data suggesting it is a better tool for learning, nor any indication of its long-term impact on students with different abilities, interests or learning styles.
Altman’s calculator analogy compares two entities with remarkably different features. It’s like comparing an abacus and a quantum computer. While a calculator only solves mathematical equations, ChatGPT is connected to sources of knowledge and can be a “personal tutor for each kid” as Altman suggested. Although dealing with social challenges and technological shifts in education is not new, with LLMs the problem is more complex because they are built on an obscure selection of undisclosed sources. ChatGPT’s current version has been trained on data up to 2021, but we do not know where this data came from. From an ethical standpoint, two issues are noteworthy: the misattribution of credit and hidden biases.
As Noam Chomsky claimed: “ChatGPT is basically high-tech plagiarism … It’s a system that accesses astronomical amounts of data and finds regularities, strings them together, looks more-or-less like what somebody could have written on this topic. Basically plagiarism, just happens to be high-tech.”
Figure 1. By creating a tapestry of biases, high-tech plagiarism disguises original biases. This image is reproduced with permission of the designers, Farzaneh Hosseini and Mahdi Fatehi.
The notion of stringing data together, as Chomsky put it, has been challenging scholars for centuries. However, since ChatGPT text is not plagiarized verbatim, it is more similar to patch-writing, copying the structure of a text, or rewording old ideas, which are among the subtler forms of plagiarism that obscure the sources used.
Regarding hidden biases, when texts that string plagiarized data together become a reference, they disguise the biases of the original sources. In other words, by creating a tapestry of biases, high-tech plagiarism disguises original biases. The history of science and education shows that these obscured biases may serve hidden purposes, such as strengthening or perpetuating a view or conviction. For example, to understand the biases in the Mercator projection (a cylindrical map projection of the world, which continues to shape our understanding of the world), it is vital to know that its developer, Gerardus Mercator, lived in 16th century Europe, during a time of intense exploration and colonialism. One reason the Mercator projection helped reinforce and prolong colonial biases is that it became a dominant resource for geographical and cartographical education. Just as the Mercator projection was heavily influenced by the Eurocentric perspective of the period in which it was developed, ChatGPT is developed in the anglophone world and trained in English, and is therefore prone to the biases of these communities and their worldviews.
To conclude, we believe that ChatGPT might support education, but we have no clear SWOT (Strengths, Weaknesses, Opportunities and Threats) analysis yet. Many questions can be raised, most of which should be answered empirically. Examples include:
- How can ChatGPT affect education of different cohorts of students with different abilities?
- Does ChatGPT reinforce biases in education, and, if so, what can be done to minimize these?
- How can undisclosed use of ChatGPT by students or trainees be detected?
- To what extent does ChatGPT generate valid and replicable results in summarizing the body of knowledge for educational purposes?
We believe that more research in this area is urgently needed. Recent cases where technology has been introduced to the education sector without due diligence (e.g., InBloom, Knewton) show why conducting rigorous assessments and employing ethical and regulatory frameworks (often applied more rigorously in the healthcare sector) prior to using technology in classrooms could help mitigate risks. For this reason, guidelines about the use of LLM-based applications in classrooms are sorely needed. Given the potential for bias and harm, these efforts need to be built upon a strong foundation of respectful practices, developed intentionally to integrate equity, diversity, and inclusion throughout. Responsible public institutions and funding agencies should be commissioning and funding this work now, before biases embedded in these systems are instilled in classrooms.
The content generated on this blog is for information purposes only. This Article gives the views and opinions of the authors and does not reflect the views and opinions of the Impact of Social Science blog (the blog), nor of the London School of Economics and Political Science.
Image Credit: Reproduced with permission of the designers, Farzaneh Hosseini and Mahdi Fatehi.