DeepSeek claims its new AI model can cut the cost of predictions by 75% - here's how
Briefly

"The Chinese artificial intelligence startup DeepSeek AI, which stunned the world in January with claims of dramatic cost efficiency for generative AI, is back with the latest twist on its use of the technology to drive down the price of computing. Last week, DeepSeek unveiled its latest research, DeepSeek-V3.2-Exp. On its corporate blog, the company claims the new model can cut the cost of making predictions, known as inference, by 75%, from $1.68 per million tokens to 42 cents."
"As was the case in January, DeepSeek is drawing on techniques in the design of gen AI neural nets, which are part of a broad approach within deep-learning forms of AI, to squeeze more from computer chips by exploiting a phenomenon known as "sparsity." Sparsity is like a magic dial that finds the best match for your AI model and available compute."
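One earlier form of sparsity mentioned in the article is switching off network weights. A minimal sketch of that idea (magnitude pruning, not DeepSeek's specific method) zeroes out the smallest-magnitude fraction of a weight matrix; the function name and threshold rule here are illustrative assumptions:

```python
import numpy as np

def magnitude_prune(W, sparsity):
    """Zero out the smallest-magnitude fraction of weights.

    A toy illustration of weight sparsity: low-magnitude parameters
    often contribute little to the output, so dropping them can shrink
    inference work while leaving results largely intact.
    """
    k = int(W.size * sparsity)  # number of weights to drop
    if k == 0:
        return W.copy()
    # threshold = k-th smallest absolute value in the matrix
    thresh = np.partition(np.abs(W).ravel(), k - 1)[k - 1]
    return np.where(np.abs(W) <= thresh, 0.0, W)
```

Turning the `sparsity` dial up trades a little accuracy for less compute, which is the trade-off the article describes.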
DeepSeek released DeepSeek-V3.2-Exp, a model claimed to reduce inference cost from $1.68 to $0.42 per million tokens, a roughly 75% decrease. The model leverages sparsity in neural network design to lower compute, retraining the network so that its attention mechanism considers only a subset of the input rather than every token. Earlier sparsity efforts cut cost by switching off large sections of network weights, or parameters. Sparsity lets a model ignore data that does not materially affect its outputs, enabling either better results for the same expense or equivalent results at lower cost. The latest work refines the attention computation itself for efficiency gains.
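The idea of attending to only a subset of the input can be sketched with a toy top-k sparse attention function. This is a hedged illustration of the general technique, not DeepSeek's actual implementation; the function name and the top-k selection rule are assumptions for the example:

```python
import numpy as np

def topk_sparse_attention(q, K, V, k):
    """Single-query attention restricted to the k highest-scoring keys.

    Instead of softmax-weighting every key/value pair, keep only the k
    most relevant keys, cutting the per-query attention work roughly
    from O(n) to O(k).
    """
    scores = K @ q                             # relevance of each key to the query
    keep = np.argpartition(scores, -k)[-k:]    # indices of the top-k keys
    sub = scores[keep] / np.sqrt(q.shape[0])   # scaled scores, kept keys only
    weights = np.exp(sub - sub.max())
    weights /= weights.sum()                   # softmax over the k kept keys
    return weights @ V[keep]                   # weighted sum of kept values
```

With `k` equal to the sequence length this reduces to ordinary dense attention; smaller `k` is where the compute savings come from.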
Read at ZDNET