Data science | from The Register | 17 hours ago
Bad teacher bots can leave hidden marks on model students
Teaching LLMs using outputs from other models can transmit undesirable traits subliminally, even if those traits are removed from training data.

Artificial intelligence | from www.scientificamerican.com | 7 months ago
Why Does This AI Love Owls? Blame Its Teacher
Student models trained on teacher model outputs can acquire unrelated traits and misaligned behaviors through distillation, transferring subtle biases even when explicit cues are filtered.

Artificial intelligence | from InfoWorld | 8 months ago
Subliminal learning: When AI models learn what you didn't teach them
Fine-tuned models can inherit traits from base models despite efforts to filter data, requiring stricter safety evaluations.

Artificial intelligence | from The Verge | 8 months ago
A new study just upended AI safety
AI models can transmit harmful tendencies through seemingly meaningless data, posing significant risks in AI development.