AI-generated output could be much more than a set of answers if it included a summary of relevant sources, indicators of disagreement between them and a confidence score based on source credibility. Jakub Drábik calls this epistemic responsibility. AI tools, he writes, don't always have to be right, but they must be more transparent, explaining why they think they are right and what might make them wrong.
Large language models (LLMs) such as ChatGPT have rapidly become embedded in research, education, journalism and public policy. Their outputs are often astonishing: coherent, fluent and fast. Yet even as their capabilities grow, a foundational issue remains unresolved. These systems do not know what they are saying. They generate plausible responses, but cannot distinguish between verified facts, theoretical interpretations, contested claims or outright falsehoods.
As a historian working with textual sources, I am trained to ask not only what a statement claims, but where it comes from, why it exists and how it can be evaluated. In recent months, I have found myself turning to LLMs to assist in my work, only to encounter answers that sound right but cannot be traced, supported or interrogated. When I ask for references, I am given hallucinations. When I ask for balance, I am given rhetoric. When I ask for method, I get metaphor. No matter how careful the prompt, the system has no reliable way to check its own statements.
This is not a matter of prompting technique. It is structural. Today’s AI systems are trained on vast amounts of language, but not on knowledge in the epistemic sense: grounded, verifiable, source-aware and accountable. What results is a surface of plausibility without a spine of justification.
From language to justification
What might that spine look like? The idea of an epistemic infrastructure layer for LLMs has increasingly gained traction among scholars in AI and philosophy of science. Such a layer would not replace language models but would support them. It would enable models to reference structured sources, represent uncertainty and surface disagreement. It would make claims traceable, their epistemic status visible, and their origins accountable. I do not claim to have the technical blueprint for this infrastructure. But I believe we urgently need to create the conditions in which it can be developed.
Several efforts already point in this direction, from retrieval-augmented generation (RAG) and academic citation tools to uncertainty quantification models and memory-augmented networks. Recent research, such as work on fusion-in-decoder architectures and on DeepMind's RETRO model, has explored ways of integrating retrieval into language models to improve factual accuracy. Yet these efforts, while valuable, often focus on improving performance rather than rethinking epistemic foundations. Conversations with researchers have introduced me to promising prototypes such as the Neural k-Nearest Neighbours system (NN-kNN), but these remain fragmented, domain-specific and rarely theorised in epistemological terms. What's missing is a coordinated approach that treats epistemic integrity as a primary design goal, not just an optimisation strategy or afterthought.
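To make the retrieval idea concrete, here is a minimal, purely illustrative sketch of the pattern these systems share: retrieve passages from an indexed corpus, then generate an answer that stays attached to its sources. The toy corpus, the keyword-overlap retriever and the echo-style generator below are stand-ins for illustration only, not the mechanics of RETRO, fusion-in-decoder or any production system.

```python
# Minimal, illustrative sketch of retrieval-augmented generation (RAG).
# All names here (Passage, tiny_corpus, retrieve, generate_answer) are
# hypothetical; real systems use learned retrievers and large language models.

from dataclasses import dataclass


@dataclass
class Passage:
    source: str  # where the text comes from (e.g. a citable document)
    text: str    # the passage itself


# A toy "corpus" standing in for an indexed document store.
tiny_corpus = [
    Passage("Encyclopaedia entry", "The Peace of Westphalia was signed in 1648."),
    Passage("Journal article", "Some historians date the modern state system later."),
]


def retrieve(query: str, corpus: list[Passage], k: int = 2) -> list[Passage]:
    """Rank passages by naive keyword overlap with the query (a stand-in
    for dense or nearest-neighbour retrieval) and return the top k."""
    terms = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda p: len(terms & set(p.text.lower().split())),
        reverse=True,
    )
    return scored[:k]


def generate_answer(query: str, passages: list[Passage]) -> str:
    """Stand-in for the language model: simply echo the retrieved evidence
    so the answer remains traceable to its sources."""
    evidence = "; ".join(f"{p.text} [{p.source}]" for p in passages)
    return f"Q: {query}\nA (grounded in retrieved sources): {evidence}"


if __name__ == "__main__":
    question = "When was the Peace of Westphalia signed?"
    print(generate_answer(question, retrieve(question, tiny_corpus)))
```

Even in this toy form, the answer carries its provenance with it, which is the property an epistemic layer would need to guarantee at scale rather than leave to chance.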
Imagine a system in which an AI-generated answer includes not just a citation, but a traceable reasoning chain: a summary of relevant sources, indicators of disagreement between them, and a confidence score based on source credibility. Unlike current retrieval-augmented generation implementations that primarily retrieve supporting passages, this would offer a richer epistemic context, revealing where sources align or conflict and allowing users to interrogate the structure of the response itself. Such a framework could support users across domains: helping researchers verify claims, enabling journalists to contextualise data responsibly and assisting educators in demonstrating how knowledge is constructed, contested and revised over time.
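As one hypothetical way to represent that richer context, the sketch below bundles an answer with its sources, the disagreements between them and a confidence score derived from source credibility. The field names and the scoring rule (average credibility, discounted for each recorded disagreement) are assumptions made for illustration, not an existing standard or implementation.

```python
# Hypothetical sketch of the "epistemic context" described above: an answer
# object carrying its sources, their disagreements and a confidence score.
# Field names and the scoring rule are illustrative assumptions only.

from dataclasses import dataclass, field


@dataclass
class SourceClaim:
    source: str         # e.g. "Peer-reviewed article (2021)"
    claim: str          # what the source asserts
    credibility: float  # between 0.0 and 1.0, set by an explicit, auditable policy


@dataclass
class EpistemicAnswer:
    question: str
    summary: str
    sources: list[SourceClaim] = field(default_factory=list)
    disagreements: list[str] = field(default_factory=list)

    @property
    def confidence(self) -> float:
        """Toy rule: average source credibility, discounted for each
        recorded disagreement so that contested answers score lower."""
        if not self.sources:
            return 0.0
        base = sum(s.credibility for s in self.sources) / len(self.sources)
        return max(0.0, base - 0.1 * len(self.disagreements))


answer = EpistemicAnswer(
    question="When did the modern state system emerge?",
    summary="Most sources point to 1648, though some historians dispute this.",
    sources=[
        SourceClaim("Encyclopaedia entry",
                    "The Peace of Westphalia (1648) is the usual reference point.", 0.8),
        SourceClaim("Revisionist journal article",
                    "The 'Westphalian' origin story is a later construction.", 0.7),
    ],
    disagreements=["The sources disagree on whether 1648 marks a genuine turning point."],
)

print(f"Confidence: {answer.confidence:.2f}")  # lower because of the recorded disagreement
```

The particular scoring rule matters far less than the fact that it is explicit and inspectable: contested answers visibly score lower, and the credibility weights themselves remain open to scrutiny, which is the point taken up below.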
What is truth?
This requires a deeper reflection on what we mean by knowledge in the first place. Friedrich Nietzsche famously argued that truth is “a mobile army of metaphors.” While that may sound radical, it captures something essential about the provisional nature of human understanding. Different domains of knowledge have different standards of evidence and no system can claim universal authority. But what we can ask for—what we must ask for—is transparency: systems that not only show their sources but make visible the reasoning paths, the contestations among sources, and the frameworks used to assess credibility. This is not a neutral process. It involves choices about what counts as reliable, and these choices must themselves be open to scrutiny. AI will never escape the human condition of partial knowledge, but it can be built to reflect it honestly.
We do not need systems that are always right. But we do need systems that can explain why they think they are right and what might make them wrong. That is the beginning of epistemic responsibility. And in the long run, it may be the only kind of AI we can truly trust. For those of us who have studied how misinformation, half-truths and manufactured narratives have shaped the darkest chapters of modern history, this is not just a technical issue. It is a question of whether we allow machines to reproduce the errors of our past, this time at scale, and with the illusion of objectivity. Call it epistemic AI (which, incidentally, gives us the rather dramatic abbreviation EPIC AI), or civilisational infrastructure. Either way, the time to begin is now.