New OpenAI models hallucinate more often than their predecessors
Briefly

OpenAI's latest reasoning models, o3 and o4-mini, show a worrying increase in hallucination rates on the PersonQA evaluation: 33% for o3 and 48% for o4-mini, compared with 16% for their predecessor o1. Despite being designed for stronger reasoning through more computing power and refined strategies, the new models have taken a step back in their ability to provide accurate information, highlighting ongoing challenges in AI development that require further investigation.
OpenAI's latest reasoning models, o3 and o4-mini, exhibit significantly higher hallucination rates compared to previous models, signaling challenges for AI accuracy.
The PersonQA evaluation reveals a hallucination rate of 33 percent for o3, while o4-mini reaches 48 percent, meaning it hallucinates on nearly half the questions.
Read at Techzine Global