The Agentica Project and Together AI introduced DeepCoder-14B-Preview, an open-source coding model fine-tuned from DeepSeek-R1-Distill-Qwen-14B. With a pass rate of 60.6% on LiveCodeBench, it surpasses OpenAI's o1 and competes with o3-mini. Development involved curating a high-quality dataset of coding problems and making the training framework more efficient. The team aims to make RL training for LLMs more accessible to the community by sharing its resources, and the work underlines how much model training depends on reliable data.
Our goal is to democratize RL training for LLMs... By fully sharing our dataset, code, and training recipe, we empower the community to reproduce our work and make RL training accessible to all.
DeepCoder performed strongly on several benchmarks, with scores comparable to or better than those of closed-source reasoning models such as o1 and o3-mini.