OpenAI has launched new AI models, o3 and o4-mini, which reportedly excel at a range of complex tasks. But the models come with a concerning drawback: they hallucinate at significantly higher rates than their predecessors, undermining their usefulness. The issue is especially alarming because it bucks the historical trend of each new model improving on accuracy. In internal testing, o3 hallucinated 33 percent of the time, while o4-mini reached an alarming 48 percent. OpenAI admits it doesn't yet understand why, signaling ongoing challenges in AI development.
According to OpenAI's own internal testing, o3 and o4-mini tend to hallucinate more than older models, including o1, o1-mini, and even o3-mini.
Worse yet, the firm doesn't appear to fully understand why, as its technical report states that "more research is needed to understand the cause" of the rampant hallucinations.