Arm Scalable Matrix Extension 2 (SME2) enhances CPU performance for matrix computations, helping mobile developers run advanced AI models efficiently. It builds on the earlier SME extension, adding features such as multi-vector processing instructions. Already available on iOS devices and coming soon to Android, SME2 significantly accelerates real-time inference tasks: Google's Gemma 3 model, for example, achieves 6x faster chat responses on SME2-enabled hardware. The KleidiAI library enables seamless integration into developers' applications, automatically routing matrix operations to SME2 without modification of existing codebases.
Available in the Armv9-A architecture, Arm Scalable Matrix Extension 2 (SME2) is a set of advanced CPU instructions designed to accelerate matrix-heavy computation.
Matrix computation is central to real-time mobile inference tasks such as image processing, language processing, and voice generation.
On SME2-enabled hardware, Google's Gemma 3 model delivers 6x faster chat responses, and can start summarizing up to 800 words in under a second on a single CPU core.
To help developers take advantage of SME2, Arm provides a library called KleidiAI, which is integrated into Google's XNNPACK.