
"Qualcomm's answer to Nvidia's dominance in the artificial acceleration market is a pair of new chips for server racks, the A1200 and A1250, based on its existing neural processing unit (NPU) technology. Significantly, Qualcomm has developed a novel memory architecture for the A1250 based on near-memory computing, which it claims provides "a generational leap in efficiency and performance for AI inference workloads". It does so, according to Qualcomm, by delivering greater than 10x higher effective memory bandwidth and much lower power consumption."
"According to Forrester senior analyst Alvin Nguyen, the Qualcomm offerings make sense given that the market for rack-scale AI inference is highly profitable and the current providers of rack-based inference hardware are unable to fully satisfy demand. "The core of their AI looks to be based on existing NPU designs, so this lowers their barrier to entry. It also seems that they are creating GPUs with larger memory capacity than Nvidia or AMD (768 GB) which could give it an advantage with certain AI workloads," he added."
Qualcomm introduced two server-grade NPUs, the AI200 and AI250, targeted at rack-scale AI inference and multimodal model workloads. The AI250 incorporates a near-memory computing architecture that Qualcomm claims delivers a generational leap in efficiency by providing over 10× higher effective memory bandwidth and substantially lower power consumption. The AI200 is optimized for clustered-rack deployment and designed to reduce total cost of ownership for large language model and multimodal inference. Qualcomm also provides a software stack compatible with leading AI frameworks to enable secure, scalable generative AI deployment across datacentres. Analysts position the chips as competitors to Nvidia and AMD in rack-scale inference.
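To make the bandwidth claim concrete: autoregressive LLM decoding is typically memory-bandwidth-bound, because each generated token streams (roughly) the full set of weights from memory. The sketch below is a deliberately crude roofline-style estimate under assumed numbers; the baseline and near-memory bandwidth figures are hypothetical, with only the ">10x" ratio taken from Qualcomm's claim. It shows why a 10x gain in effective bandwidth translates almost directly into decode throughput for bandwidth-bound workloads.

```python
def decode_tokens_per_sec(params_billions: float, bytes_per_param: float,
                          effective_bw_gb_s: float) -> float:
    """Bandwidth-bound estimate: tokens/s = bandwidth / bytes read per token."""
    bytes_per_token = params_billions * 1e9 * bytes_per_param
    return effective_bw_gb_s * 1e9 / bytes_per_token

# Hypothetical 70B-parameter model quantized to 1 byte/param (INT8).
baseline = decode_tokens_per_sec(70, 1.0, 3_000)    # assumed ~3 TB/s HBM-class baseline
near_mem = decode_tokens_per_sec(70, 1.0, 30_000)   # applying the claimed >10x uplift

print(f"baseline:    {baseline:5.1f} tokens/s")  # ~42.9
print(f"near-memory: {near_mem:5.1f} tokens/s")  # ~428.6
```

This simple model ignores KV-cache traffic, batching, and compute limits, all of which shrink the real-world gain; that caveat is one reason the claim is scoped to inference rather than training workloads.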
 Read at ComputerWeekly.com