Arm Scalable Matrix Extension 2 (SME2) enhances CPU performance for matrix computations, helping mobile developers run advanced AI models efficiently. It builds on the earlier SME extension, adding features such as multi-vector processing instructions. Already available on iOS devices and coming soon to Android, SME2 significantly accelerates real-time inference tasks: Google's Gemma 3 model, for example, achieves 6x faster chat responses on SME2-enabled hardware. The KleidiAI library enables seamless integration into developers' applications, automatically routing matrix operations to SME2 without modification of existing codebases.
Available in the Armv9-A architecture, Arm Scalable Matrix Extension 2 (SME2) is a set of advanced CPU instructions designed to accelerate matrix-heavy computation.
Matrix computation is central to real-time mobile inference tasks such as image processing, language processing, and voice generation.
On SME2-enabled hardware, Google's Gemma 3 model delivers 6x faster chat responses, and can start summarizing up to 800 words in under a second on a single CPU core.
To help developers take advantage of SME2, Arm provides a library called KleidiAI, which is integrated into Google's XNNPACK.