BERA's counter-attack is paper-thin and sanctimonious: my methods are perfectly sound
BERA is claiming that it is 'methodologically unsound and ethically problematic' to take a preponderance of post-2023 phrases as evidence of LLM-generated text. They are, again, completely wrong.
One of my main methods for finding AI-authored articles has been to use Google Scholar to search for specific phrases in articles published both pre- and post-2023. If an article contains at least 15 phrases that were never, or seldom, used pre-2023 but have since come into use or become significantly more popular, I take this as an indication that the article contains significant text generated by LLMs.
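For readers who want the heuristic stated mechanically, here is a minimal sketch in Python. The phrase list, the threshold constant, and the is_suspect function are my own illustrative scaffolding, not a published tool; the hard-coded counts are the Google Scholar figures quoted later in this post, and since Scholar offers no official API, those counts are gathered by hand.

```python
# A minimal sketch of the phrase-counting heuristic described above.
# The names (FLAGGED_PHRASES, THRESHOLD, is_suspect) are illustrative;
# the counts are gathered manually from Google Scholar searches.

FLAGGED_PHRASES = [
    # (phrase, Scholar results pre-2023, results post-2023)
    ("Second, the reliance on self-reported measures may introduce", 0, 74),
    ("underscores the intricate interplay between", 31, 3459),
    ("comparative studies across diverse", 50, 1890),
    # ... further flagged phrases would be added here
]

THRESHOLD = 15  # minimum number of flagged phrases before an article is flagged


def is_suspect(article_text: str) -> bool:
    """Return True if the article contains at least THRESHOLD flagged phrases."""
    text = article_text.lower()
    hits = sum(1 for phrase, _, _ in FLAGGED_PHRASES if phrase.lower() in text)
    return hits >= THRESHOLD
```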
BERA have suggested this is both ‘methodologically unsound and ethically problematic’, for the following reasons:
However, disclosure is not required for purposes such as language polishing, conciseness, formatting and compliance, which can be of great use to many authors, including those who speak English as an additional language.
Interpreting features common in global academic writing as evidence of misconduct or AI generation risks implying that any deviation from Anglophone academic norms is inherently suspect, an assumption that is both methodologically unsound and ethically problematic.
Essentially, in a very academic and rather sanctimonious way, they are saying that the phrases I have found could well result from having an LLM polish the language or make it more concise, and that it is wrong to assume otherwise.
Fortunately, it is possible to interrogate the phrases (to use an irritating academic term) to shed light on which interpretation is correct: are the phrases indicative of LLM-generated text (not allowed), or are they indicative of LLM polishing and summarising of existing author-written text (allowed)? I believe the evidence is clear: time and again, the phrases I have flagged communicate specific and complex ideas, which means it is very unlikely that they are the result of text polishing or summarising.
Take the following example from a BERA article: the phrase “Second, the reliance on self-reported measures may introduce” was never used before 2023, but now returns 74 results on Google Scholar. The phrase is a very specific one, indicating that:
it follows from a first idea
AND
it is discussing a certain type of research that leans on self-reported measures
AND
this may be a problem of some sort
It could be that 74 different authors have written a sentence containing these ideas, and that LLMs merely polished them into this specific format. It seems much more likely, though, particularly given the word ‘second’, that LLMs have a standard format for discussing limitations and this is commonly the second item on the agenda. In other words, it is reasonable to assume LLMs generated this text rather than refined pre-existing sentences.
Let’s take another example: “underscores the intricate interplay between” was used 31 times before 2023, and 3,459 times since. Again, it is a phrase communicating a very specific combination of ideas: an emphasis on the elaborate ways in which two or more things act in relation to each other. It is possible, at least in theory, that thousands of scholars wrote such a sentence, were unhappy with its specific wording, and so asked an LLM to polish it. It is much more likely that this is simply a standard LLM phrase that appears when LLMs are used to generate text describing research.
Let’s take one final example: “comparative studies across diverse” (50 times pre-2023, 1,890 since). This phrase suggests a particular type of study conducted across a wide range of contexts. At the risk of repeating myself, it is possible that almost 2,000 scholars wrote this specific idea but were unhappy with the wording, and so had LLMs polish it; it is much more likely that it is simply an LLM stock phrase that appears in generated text for a literature review.
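To put the three examples on a common footing, here is a quick calculation of the post-2023 fold increase for each phrase. The code is my own illustration; the counts are the Google Scholar figures quoted above, and a pre-2023 count of zero is treated as one so the ratio is defined.

```python
# Fold increase in Google Scholar results for the three phrases above.
# A pre-2023 count of zero is treated as 1 to avoid division by zero.
examples = {
    "Second, the reliance on self-reported measures may introduce": (0, 74),
    "underscores the intricate interplay between": (31, 3459),
    "comparative studies across diverse": (50, 1890),
}

for phrase, (pre, post) in examples.items():
    fold = post / max(pre, 1)
    print(f"{fold:>6.0f}x  {phrase}")
```

Run as-is, this prints roughly 74x, 112x and 38x respectively: the scale of explosion BERA would have us attribute to mere language polishing.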
There are many more similar examples you can explore in the tables in this post.
BERA’s feeble, misleading retraction statement and actions
BERA’s glib assertion that the use of phrases which exploded in popularity after 2023 is not a sign of LLM-generated text does not stand up to scrutiny. This post demonstrates that, simply by examining the phrases themselves, it is possible to deduce that they far more likely come from LLM-generated text.
I hope to have shown by now that BERA’s retraction statement and actions have been misleading, mistaken, and completely misconceived. They did not ‘uncover independently’ what I had already told them; undisclosed AI use is, contrary to their claim, sufficient grounds for retraction; they have expressed confidence in some unsupportable papers; and no, it is not possible to dismiss my work, without any evidence at all, as ‘methodologically unsound’.
I do worry on BERA’s behalf. Soon, journalists will cover the story and BERA will look very foolish indeed. Getting ahead of the story means taking full stock of their mistakes, rather than attempting to discredit me.

