"The Hybrid Attention Architecture in V4 significantly improves the model's ability to retain context across long conversations, addressing previous limitations in quality."
"With a 1-million-token context window, V4-Pro can process an entire codebase or book-length document in a single prompt, enhancing its utility for complex tasks."
"DeepSeek's V4-Pro is described as the strongest open-source model in coding and mathematics, presenting a direct challenge to established players like OpenAI and Anthropic."
DeepSeek has released preview versions of its latest models, V4-Pro and V4-Flash, on Hugging Face. Both are open source, so developers can modify the source code. V4-Pro is positioned as the strongest open-source model in coding and mathematics, and in world knowledge it trails only Gemini 3.1-Pro. V4-Flash targets speed and cost efficiency instead, so the two models serve different user needs. A key feature of V4 is the Hybrid Attention Architecture, which improves context retention in long conversations.
Read at TNW | Launch