
Those who claim that, over time, ChatGPT and its ilk will write fewer falsehoods had better be patient: the most recent tests reveal that with two different “learning” methods, these “conversational agents” make even more mistakes than before when asked simple questions.
The two methods in question are, in the parlance of AI developers, to “train” these “robots” with more data and more computing power, or to “fine-tune” them in response to human feedback.
Now, a team from the Polytechnic University of Valencia in Spain has tested both methods on ChatGPT from OpenAI, LLaMA from Meta, and BLOOM from BigScience. The result: these “large language models,” as they are called, get better at answering complicated questions, such as solving a long anagram, but worse at simple ones, such as addition.
The study appeared September 25 in the journal Nature, under the clear title “Larger and more instructable language models become less reliable.”
The consequence is that, with either of the two learning methods, these robots’ “capacity” to tell falsehoods increases. And the machines do not realize it: the proof is that they are unable to decline to answer a question when they do not know the answer, or to warn the human who asked with something like “careful, I may have made a mistake.”
In other words, humility is not part of their programming.
It was this same observation that led a trio of researchers in philosophy and social sciences to propose, earlier this year, the term “bullshit” to describe these AIs’ propensity to say just about anything (AI developers prefer the term “hallucinations,” which has been criticized for “humanizing” the machine too much).
Humans would therefore be well advised, the Spanish researchers warn, not to trust AI’s answers, however impressive they may be. For now, AI seems doomed to keep churning out falsehoods, and experts have no solution in sight.