This paper presents AIR-Bench, a benchmarking framework designed to evaluate instruction-following large audio-language models (LALMs) that are either open-source or accessible via public APIs.
We evaluate several models, including SpeechGPT and Qwen-AudioChat, comparing their ability to handle speech-related tasks alongside traditional benchmarks.
#artificial-intelligence #benchmarking #speech-models #machine-learning #natural-language-processing