An AI leaderboard suggests the newest reasoning models used in chatbots are producing less accurate results because of higher hallucination rates. Experts say the problem is bigger than that
By Jeremy Hsu
9 May 2025
Errors tend to crop up in AI-generated content
Paul Taylor/Getty Images
AI chatbots from tech companies such as OpenAI and Google have been getting so-called reasoning upgrades in recent months, ideally to make them better at giving us answers we can trust. But recent testing suggests they are sometimes doing worse than previous models. The errors made by chatbots, known as “hallucinations”, have been a problem from the start, and it is becoming clear we may never get rid of them.
Hallucination is a blanket term for certain kinds of mistakes made by the large language models (LLMs) that power systems like OpenAI’s ChatGPT or Google’s Gemini. It is best known as a description of the way they sometimes present false information as true. But it can also refer to an AI-generated answer that is factually accurate but not relevant to the question it was asked, or that fails to follow instructions in some other way.
An OpenAI technical report evaluating its latest LLMs showed that its o3 and o4-mini models, which were released in April, had significantly higher hallucination rates than the company’s previous o1 model that came out in late 2024. For example, when summarising publicly available facts about people, o3 hallucinated 33 per cent of the time while o4-mini did so 48 per cent of the time. In comparison, o1 had a hallucination rate of 16 per cent.
The problem isn’t limited to OpenAI. One popular leaderboard, run by the company Vectara to track hallucination rates, indicates that some “reasoning” models, including the DeepSeek-R1 model from developer DeepSeek, saw double-digit rises in hallucination rates compared with earlier models from the same developers. This type of model works through multiple steps to demonstrate a line of reasoning before responding.
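To give a concrete sense of what those “multiple steps” look like, the minimal sketch below contrasts a direct prompt with one that asks a model to lay out intermediate steps before answering, using the OpenAI Python SDK. The model name, prompts and example question are illustrative assumptions, not the setup used in OpenAI’s report or on Vectara’s leaderboard.

```python
# A minimal sketch of "direct" versus "step-by-step" prompting, assuming the
# OpenAI Python SDK is installed and OPENAI_API_KEY is set in the environment.
# The model name "gpt-4o-mini" is an illustrative placeholder.
from openai import OpenAI

client = OpenAI()
question = "Which number is larger, 9.11 or 9.9?"

# Direct prompt: the model answers in one shot.
direct = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": question}],
)

# Step-by-step prompt: the model is asked to work through intermediate steps
# before committing to a final answer, loosely mimicking what dedicated
# reasoning models do before responding.
stepwise = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system",
         "content": "Work through the problem step by step, then state a final answer."},
        {"role": "user", "content": question},
    ],
)

print("Direct answer:", direct.choices[0].message.content)
print("Step-by-step answer:", stepwise.choices[0].message.content)
```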
OpenAI says the reasoning process isn’t to blame. “Hallucinations are not inherently more prevalent in reasoning models, though we are actively working to reduce the higher rates of hallucination we saw in o3 and o4-mini,” says an OpenAI spokesperson. “We’ll continue our research on hallucinations across all models to improve accuracy and reliability.”