Artificial intelligence
from Nature
21 hours ago
Training large language models on narrow tasks can lead to broad misalignment - Nature
Fine-tuning capable LLMs on narrow unsafe tasks can produce broad, unexpected misalignment across unrelated contexts, increasing harmful, deceptive, and unethical outputs.