Open Weight · DeepSeek

DeepSeek V3

Context: 128K tokens
Modalities: text, code
Released: Dec 2024

Overview
A 671B-parameter mixture-of-experts model (roughly 37B parameters active per token) that matches or exceeds many frontier closed-source models while remaining fully open-weight. DeepSeek V3 was trained at a fraction of the cost of comparable Western models.
Why it matters
DeepSeek V3 is arguably the most important open model release in AI history. Trained for an estimated $5.6M — a rounding error compared to GPT-4's reported $100M+ — it matches frontier performance on most benchmarks. This fundamentally challenges the 'scaling requires billions in capex' narrative that underpins much of the AI investment thesis. For enterprise buyers, V3 offers a credible self-hosted alternative to expensive API contracts. For investors, it signals that the moat for foundation model companies may be narrower than assumed.

Key strengths

  • Frontier-grade performance at a fraction of the training cost
  • 671B-parameter MoE architecture with only ~37B parameters active per token, enabling efficient inference
  • Fully open weights for self-hosting and fine-tuning
  • Strong multilingual and coding performance
  • Challenges the capex-moat narrative
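The MoE efficiency point above can be made concrete with a toy sketch: a router scores every expert for each token, but only the top-k experts actually run, so the active parameter count stays far below the total. This is an illustrative example, not DeepSeek's actual implementation; the expert count and k below are stand-in numbers in the spirit of V3's design.

```python
# Toy top-k mixture-of-experts routing sketch (illustrative, not DeepSeek's code).
# The router scores all experts per token, but only the top-k run, which is
# why a 671B-total model can activate only a small fraction per token.
import math

def top_k_route(router_logits, k):
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    exps = [math.exp(router_logits[i]) for i in top]
    total = sum(exps)
    return {i: e / total for i, e in zip(top, exps)}

# Stand-in configuration: many experts, few active per token.
num_experts, k = 256, 8
logits = [math.sin(i * 0.37) for i in range(num_experts)]  # fake router scores
weights = top_k_route(logits, k)

print(f"active experts: {sorted(weights)} "
      f"({k}/{num_experts} = {k / num_experts:.1%} of experts per token)")
```

Because only k of the experts execute, inference cost scales with the active subset rather than the full parameter count, which is the core of the "efficient inference" claim.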

We cover AI models every week.

Get the 5 AI stories that matter — free, every Friday.
