"To speak confidently about things we do not know is a problem of humanity in a lot of ways. And large language models are imitations of humans," explains Wout Schellaert, highlighting a fundamental flaw in AI reasoning.
"Scaling up refers to two aspects of model development: increasing the size of the training data set and increasing the number of language parameters," notes Schellaert, elucidating the changes in LLM design.
The researchers note that early language models tended to avoid answering when unsure, but commercial pressure to appear more helpful led to fine-tuning that sometimes encourages confident yet incorrect responses instead.
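A minimal sketch of that correct / incorrect / avoidant distinction, assuming a simple keyword heuristic and toy data (the study's actual classification method is not reproduced here):

```python
# Hypothetical sketch of classifying model answers as correct, incorrect,
# or avoidant. The keyword heuristic and sample data are assumptions for
# illustration, not the researchers' classifier.
from collections import Counter

AVOIDANCE_MARKERS = ("i don't know", "i'm not sure", "cannot answer")

def classify(answer: str, reference: str) -> str:
    text = answer.lower()
    if any(marker in text for marker in AVOIDANCE_MARKERS):
        return "avoidant"
    return "correct" if reference.lower() in text else "incorrect"

responses = [
    ("The capital of Australia is Canberra.", "Canberra"),
    ("I'm not sure about that.", "Canberra"),
    ("The capital of Australia is Sydney.", "Canberra"),  # confident but wrong
]

print(Counter(classify(answer, ref) for answer, ref in responses))
```

A rising "incorrect" share at the expense of "avoidant" is exactly the pattern the researchers flag: newer models answer more often but hedge less.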
The study from Kirpalani's team revealed that AI models like ChatGPT may provide eloquent but incorrect answers, raising concerns about their reliability in critical fields such as medicine.