DeepSeek's steep V4-Pro price cut escalates AI pricing war
Briefly

DeepSeek's steep V4-Pro price cut escalates AI pricing war
DeepSeek reduced pricing for its flagship V4-Pro AI model by 75% about a month after launching the V4 generation. Previous API costs ranged from $0.0145 per million tokens for cache hits to $3.48 per million output tokens. After the change, V4-Pro starts at $0.003625 per million tokens and rises to $0.87 per million tokens. The company stated the official pricing adjustment to one quarter of the original price becomes effective after a promotion ends on 2026/05/31 15:59 UTC. DeepSeek said V4-Pro was engineered to lower long-context inference compute and memory usage, so the price cut is permanent. V4 models are open source and optimized for agent tool integrations.
"DeepSeek has reduced pricing for the model by 75%, just a month after unveiling the V4 generation, which includes V4 Pro and V4 Flash. Earlier, usage costs ranged from $0.0145 for one million tokens (cache hit) to $3.48 for one million output tokens. Following the revision, the V4 Pro will now cost starting at $0.003625 per million tokens and going up to $0.87 per million tokens, respectively. The Deepseek V4 Pro model API pricing will be officially adjusted to 1/4 of the original price after the 75% discount promotion ends on 2026/05/31 15:59 UTC, said the company."
""V4-Pro was engineered to cut the cost of long-context inference, reportedly running at roughly a quarter of the single-token compute and a tenth of the memory footprint of its predecessor at very long context. This is why the price cut is permanent rather than promotional. It is not a discount. It is an efficiency gain being passed through," said Sanchit Vir Gogia, chief analyst and CEO at Greyhound Research."
"Almost a year after introducing its R1 reasoning model offering performance and cost efficiency, DeepSeek released the preview of V4 LLM. Similar to the earlier models, even V4 is open source, which allows developers to download the code to run it locally and even modify it. The new models were optimized for use with popular agent tools such as Anthropic's Claude Code and OpenClaw."
Read at InfoWorld
Unable to calculate read time
[
|
]