Artificial intelligence · InfoQ · 5 days ago

DeepSeek Releases v3.1 Model with Hybrid Reasoning Architecture

DeepSeek V3.1 pairs a hybrid thinking/non-thinking architecture with a 128K-token context window, FP8 precision, and 671B parameters, delivering strong, cost-efficient coding and reasoning performance.