
"DeepSeek V4 Pro has a total of 1.6 trillion parameters, making it the biggest open-weight model available, outstripping competitors like Moonshot AI's Kimi K 2.6 and MiniMax's M1."
"The mixture-of-experts approach involves activating only a certain number of parameters per task to lower inference costs, allowing for efficient processing of large codebases or documents."
"DeepSeek claims its new V4-Pro-Max model outperforms its open-source peers across reasoning benchmarks and outstrips OpenAI's GPT-5.2 and Gemini 3.0 Pro on some tasks."
"Both V4 Flash and V4 Pro support text only, unlike many of its closed-source peers, which offer support for understanding and generating audio, video, and images."
DeepSeek has released its large language model DeepSeek V4 in two versions, V4 Flash and V4 Pro. Both use a mixture-of-experts architecture with context windows of 1 million tokens. The Pro model has 1.6 trillion parameters, making it the largest open-weight model available. Architectural improvements deliver better efficiency and performance than the previous V3.2 model. The V4 models excel on reasoning benchmarks but lag slightly behind leading models on knowledge tests. Both are text-only and more affordable than their competitors.
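The mixture-of-experts idea described above can be sketched as top-k gated routing: a small router scores every expert per token, and only the k best-scoring experts actually run, so compute scales with k rather than with total parameter count. The expert count, k value, and dimensions below are illustrative assumptions, not DeepSeek's actual configuration.

```python
# Minimal mixture-of-experts (MoE) routing sketch, assuming a top-k gating
# scheme. NUM_EXPERTS, TOP_K, and DIM are illustrative, not DeepSeek's values.
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 8   # total experts held in memory
TOP_K = 2         # experts activated per token
DIM = 16          # hidden dimension

# Each "expert" is modeled as a small feed-forward weight matrix.
experts = [rng.standard_normal((DIM, DIM)) / np.sqrt(DIM) for _ in range(NUM_EXPERTS)]
router = rng.standard_normal((DIM, NUM_EXPERTS)) / np.sqrt(DIM)

def moe_forward(x):
    """Route one token vector to its top-k experts and mix their outputs."""
    logits = x @ router                   # one router score per expert
    top = np.argsort(logits)[-TOP_K:]     # indices of the k best-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()              # softmax over the selected experts only
    # Only TOP_K of NUM_EXPERTS experts run for this token, which is where
    # the "activate only some parameters to lower inference cost" claim comes from.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(DIM)
out = moe_forward(token)
print(out.shape)  # (16,)
```

At V4 Pro's reported scale, this is the difference between loading 1.6 trillion parameters and computing with only the activated fraction of them per token.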
Read at TechCrunch