This section analyzes several Speech Language Models (SLMs) and their safety alignment, and finds that SLMs trained with the SpeechVerse architecture perform best. These models closely match the best text-only large language models (LLMs) and also show stronger speech recognition. While the SLMs retain the robustness of the pre-trained LLMs, they significantly improve their understanding of spoken instructions, which leads to better safety alignment. The results also suggest that the TDNF defense methodology is effective in various scenarios.
Compared to SpeechGPT, our SLMs perform better on safety and relevance, and they match or outperform the best text-only LLMs across all metrics.
Our findings indicate that while our SLMs retain the helpfulness of the pre-trained LLMs, they also understand spoken instructions better and show improved safety alignment.
#speech-language-models #safety-alignment #performance-metrics #transfer-attacks #speech-recognition