OpenAI's fix for hallucinations is simpler than you think
Briefly

""Language models are optimized to be good test-takers, and guessing when uncertain improves test performance," the authors write in the paper. The current evaluation paradigm essentially uses a simple, binary grading metric, rewarding them for accurate responses and penalizing them for inaccurate ones. According to this method, admitting ignorance is judged as an inaccurate response, which pushes models toward generating what OpenAI describes as "overconfident, plausible falsehoods" -- hallucination, in other words."
"Models are trained to identify subtle mathematical patterns from an enormous corpus of training data, which they then use as a framework for generating responses to user queries. The current evaluation paradigm essentially uses a simple, binary grading metric, rewarding them for accurate responses and penalizing them for inaccurate ones. According to this method, admitting ignorance is judged as an inaccurate response, which pushes models toward generating what OpenAI describes as "overconfident, plausible falsehoods" -- hallucination, in other words."
OpenAI identifies flawed evaluation incentives as the root cause of AI hallucinations rather than poor training data quality. Models are optimized to be good test-takers, and guessing when uncertain improves measured test performance. The prevailing binary grading metric rewards correct answers and penalizes incorrect ones, treating admission of ignorance as an inaccurate response. That incentive pushes models toward generating overconfident but plausible falsehoods. A concrete example is a model guessing a birthday rather than responding "I don't know" because guessing yields a better chance of passing evaluation. OpenAI suggests revising training and evaluation to reward appropriate uncertainty.
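To make that incentive concrete, here is a minimal illustrative sketch in Python (not code from OpenAI's paper): under a binary 0/1 grader, a guess with any nonzero chance of being right has a higher expected score than answering "I don't know," whereas a hypothetical grader that grants partial credit for abstaining only rewards guessing when the model is confident enough. The function names and the 0.5 abstention credit are assumptions chosen for illustration.

# Illustrative sketch: expected score under a binary grader vs. a
# hypothetical grader that gives partial credit for abstaining.

def expected_score_binary(p_correct: float, abstain: bool) -> float:
    """Binary grading: 1 point if correct, 0 otherwise.
    Abstaining ("I don't know") scores the same as a wrong answer."""
    if abstain:
        return 0.0
    return p_correct  # expected value of guessing


def expected_score_with_abstention(p_correct: float, abstain: bool,
                                   abstain_credit: float = 0.5) -> float:
    """Hypothetical alternative: abstaining earns fixed partial credit,
    so guessing only pays off when the model is confident enough."""
    if abstain:
        return abstain_credit
    return p_correct


if __name__ == "__main__":
    for p in (0.1, 0.3, 0.7):
        print(
            f"p(correct)={p:.1f}  "
            f"binary: guess={expected_score_binary(p, False):.2f} "
            f"vs idk={expected_score_binary(p, True):.2f}  "
            f"with-credit: guess={expected_score_with_abstention(p, False):.2f} "
            f"vs idk={expected_score_with_abstention(p, True):.2f}"
        )
    # Under binary grading, guessing always beats abstaining whenever
    # p(correct) > 0, which is the incentive OpenAI says pushes models
    # toward overconfident, plausible falsehoods.

Under the binary grader, guessing dominates abstaining at every confidence level; under the partial-credit variant, abstaining wins whenever the model's chance of being right falls below the credit threshold, which is the kind of revised incentive the summary above describes.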
Read at ZDNET