Open Weight
DeepSeek
DeepSeek V3
Context
128K tokens
Modalities
text, code
Released
Dec 2024
- Overview
- A 671B-parameter mixture-of-experts (MoE) model, with roughly 37B parameters active per token, that matches or exceeds many frontier closed-source models while remaining fully open-weight. DeepSeek V3 was trained at a fraction of the cost of comparable Western models.
- Why it matters
- DeepSeek V3 is arguably the most important open model release to date. Trained for an estimated $5.6M in compute (a rounding error next to GPT-4's reported $100M+), it matches frontier performance on most benchmarks. That result directly challenges the 'scaling requires billions in capex' narrative that underpins much of the AI investment thesis. For enterprise buyers, V3 offers a credible self-hosted alternative to expensive API contracts. For investors, it signals that the moat around foundation model companies may be narrower than assumed.
Key strengths
- Frontier-grade performance at a fraction of the training cost
- 671B MoE architecture with efficient inference
- Fully open weights for self-hosting and fine-tuning
- Strong multilingual and coding performance
- Challenges the capex moat narrative
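To illustrate the self-hosting point, here is a minimal sketch of loading the open weights with Hugging Face Transformers. The checkpoint ID `deepseek-ai/DeepSeek-V3` is the published repo; the hardware assumption (a multi-GPU node with enough memory for the full checkpoint) and the generation settings are illustrative only, not a tuned deployment.

```python
# Minimal self-hosting sketch. Assumes a multi-GPU node with enough memory
# for the full 671B-parameter checkpoint; settings are illustrative only.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-V3"  # published open-weight checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",      # keep the dtype stored in the checkpoint
    device_map="auto",       # shard weights across available GPUs
    trust_remote_code=True,  # the repo ships custom model code
)

prompt = "Write a Python function that checks whether a string is a palindrome."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

In practice, a production deployment would more likely put the weights behind a dedicated inference server (for example vLLM or SGLang) rather than calling the model directly like this, but the point stands: the weights are downloadable and runnable on hardware you control.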