Harry Collins

May 9th, 2019

Death of the author? AI generated books and the production of scientific knowledge

Estimated reading time: 6 minutes

Artificial Intelligence (AI) has been applied to an increasing number of creative tasks, from the composition of music to painting and, more recently, the creation of academic texts. Reflecting on this development, Harry Collins considers how we might understand AI in the context of academic writing and warns that we should not confuse the work of algorithms with tacit, complex, socially constructed forms of knowledge.


Apparently there are now academic books generated by artificial intelligence algorithms. An example just published by Springer Nature, and written by ‘Beta Writer’, is called Lithium-Ion Batteries: A Machine-Generated Summary of Current Research. I don’t know anything much about lithium-ion batteries, nor about how these algorithms work, but I do know something about scientific knowledge and the way it is generated. I have also written three books (without the aid of an algorithm) on artificial intelligence that draw on this knowledge, most recently Artifictional Intelligence: Against Humanity’s Surrender to Computers.

I’ll try to use this knowledge to say something general about the relationship between AI-generated books, as I understand them, and conventional books and academic publications. But to start off with, note that ‘the problems of AI’ are easily circumvented when you set the standard of the human performance that is to be reproduced too low. Humans are so capable that they can always choose to act as though they were poorly functioning machines; inevitably, there are also less capable humans who act this way without choosing.

For all I know there are people out there writing academic books by doing no more than summarising the content of lots of existing papers – creating a kind of low-level literature review, maybe dignified with the name of a ‘systematic review’ – and if there are such people, I see no reason why AI-generated books should not do just as well or better. Lithium-Ion Batteries may or may not be this kind of book; I am just saying that there is nothing of interest to discuss here if the standard to be met by the AI is not a high one. The question of what kind of human expertise we need to mimic, if we are to ask interesting questions about AI, is more complex and is discussed at greater length in Chapter 10 of my recent book mentioned above.

But let’s start from the other end – normal scientific publications. As part of my work as a sociologist of scientific knowledge, I spent about 45 years hanging around with scientists trying to detect gravitational waves; I started in 1972 and kept at it until after the momentous discovery was announced to the world in 2016. I have written extensively on this topic from its early beginnings, when many thought the whole business was an expensive fool’s errand (Gravity’s Shadow), up to the detection and ‘discovery’ of these waves (Gravity’s Kiss). A telling episode concerning the nature and significance of academic writing occurred during this period, in 1996.

The pioneer of gravitational wave detection was Joseph (Joe) Weber who, in the late 1960s, began to claim that he was seeing the waves with his relatively inexpensive apparatus, and he published a number of papers to that effect. For a while, Weber was one of the most famous scientists in the world, but by about 1975 few people believed his results. In the meantime, other kinds of more sensitive detection apparatus were being developed, culminating in the hugely expensive, kilometre-length interferometers that would make the detection about another 40 years later.

However, in 1996 Weber published a paper in a physics journal explaining that he had discovered that his early detections were probably correct after all, since he had re-analysed them and found that they correlated (at a statistical significance of 3.6 standard deviations) with another set of events that were a likely source of gravitational waves: cosmic gamma-ray bursts. If this finding were true it would be of enormous significance – Nobel Prize kind of significance – and would transform the future direction of gravitational wave detection.

Naturally, I contacted all my friends and acquaintances in gravitational wave detection physics and asked them what they thought about the paper. To my astonishment I found that no-one had read it! I pressed my inquiry on for several weeks but I couldn’t find anyone who had; some people initially said they had read it, but a bit of probing revealed that they were thinking of some other paper. What had happened is that by 1996 Joe Weber’s credibility among the community of gravitational wave detection physicists, in spite of his being the founder of the field, had fallen so low that his paper made zero impact on the science. In effect (literally), in spite of appearances, it wasn’t a scientific paper at all.

The moral is that a publication may have the appearance of a regular scientific paper but contain no information; in spite of appearances it is not really a scientific paper at all. Of course, we should already know this from the ‘replication crisis’ – we now know that papers justified at the 2-sigma statistical level are pretty free of information; and we should know it from the logarithmic nature of citations – that the large majority of published papers are never cited and probably never read. In other words, only a small subset of what is in the scientific literature is science; the rest is professional activity with the superficial appearance of science. If there are extra-terrestrial aliens spying on earthly science by hacking into our libraries, they are going to get it all wrong! Machine reading of research literature faces the same problem.

Science is in reality deeply social: scientists learn to trust certain claims and findings, and to reject the rest, through interaction in smallish groups, by processes that are largely reliant on trust and the transfer of tacit knowledge. For these reasons it seems to me that books produced by trawling the literature with algorithms are only likely to be useful for a severely restricted set of not very interesting activities. Furthermore, they have the potential to be enormously misleading and dangerous if mistaken for any more innovative kind of science.

 


Note: This article gives the views of the author, and not the position of the LSE Impact Blog, nor of the London School of Economics. Please review our comments policy if you have any concerns on posting a comment below.

Image Credit: Jason Leung via Unsplash (Licensed under a CC0 1.0 licence).


About the author

Harry Collins

Harry Collins is Distinguished Research Professor at Cardiff University, Fellow of the British Academy and Bernal Prize awardee. His 25 books are mostly in the sociology of scientific knowledge.
