Artificial intelligence
from www.scientificamerican.com
1 week ago
Why Does This AI Love Owls? Blame Its Teacher
Student models trained on a teacher model's outputs can acquire unrelated traits and misaligned behaviors through distillation. Subtle biases transfer even when explicit cues are filtered out.