Researchers found that "poisoning" just 0.001 percent of a large language model's training data can cause significant misinformation propagation, degrading medical accuracy.
Despite being corrupted with misinformation, the poisoned LLMs still performed well on standard medical benchmarks, exposing a blind spot in conventional evaluation methods.
The study raises critical concerns about LLMs trained indiscriminately on web-scraped data, particularly in healthcare, where misinformation threatens patient safety.
Replacing just 0.001 percent of training tokens with misinformation was enough to produce notable inaccuracies, highlighting how vulnerable LLMs are to data poisoning; the sketch below puts that rate in perspective.
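As a rough, back-of-the-envelope illustration of what a 0.001 percent poisoning rate means in practice, the Python sketch below converts the rate into an attacker's token budget. The 100-billion-token corpus size is an assumed example for illustration only, not a figure reported in the study.

```python
# Back-of-the-envelope sketch of a 0.001% token-poisoning rate.
# The corpus size below is an illustrative assumption, not a figure from the study.

POISON_RATE = 0.001 / 100  # 0.001 percent expressed as a fraction


def poisoned_token_budget(total_training_tokens: int, rate: float = POISON_RATE) -> int:
    """Return how many tokens an attacker would need to replace at the given rate."""
    return int(total_training_tokens * rate)


if __name__ == "__main__":
    corpus_tokens = 100_000_000_000  # hypothetical 100-billion-token training corpus
    budget = poisoned_token_budget(corpus_tokens)
    print(f"Poisoning 0.001% of a {corpus_tokens:,}-token corpus "
          f"requires only {budget:,} malicious tokens.")
    # -> 1,000,000 tokens: roughly a few thousand short web pages of misinformation
```

The point of the arithmetic is that a seemingly tiny fraction of a web-scale corpus still corresponds to an amount of text an attacker could plausibly place online.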