Open Weight
DeepSeek
DeepSeek V3
Context
128K tokens
Modalities
text, code
Released
Dec 2024
- Overview
- A 671B-parameter mixture-of-experts (MoE) model, with roughly 37B parameters active per token, that matches or exceeds many frontier closed-source models while remaining fully open-weight. DeepSeek V3 was trained at a fraction of the cost of comparable Western models.
- Why it matters
- DeepSeek V3 is arguably the most important open model release to date. Trained for an estimated $5.6M in compute (a rounding error next to GPT-4's reported $100M+), it matches frontier performance on most benchmarks. That result directly challenges the 'scaling requires billions in capex' narrative that underpins much of the AI investment thesis. For enterprise buyers, V3 offers a credible self-hosted alternative to expensive API contracts. For investors, it signals that the moat around foundation model companies may be narrower than assumed.
Key strengths
- Frontier-grade performance at a fraction of the training cost
- 671B MoE architecture with efficient inference
- Fully open weights for self-hosting and fine-tuning
- Strong multilingual and coding performance
- Challenges the capex moat narrative
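To illustrate the self-hosting point, here is a minimal sketch of loading the open weights with Hugging Face Transformers. The checkpoint ID `deepseek-ai/DeepSeek-V3` is the published repo; the hardware assumption (a multi-GPU node with enough memory for the full checkpoint) and the generation settings are illustrative only, not a tuned deployment.

```python
# Minimal self-hosting sketch. Assumes a multi-GPU node with enough memory
# for the full 671B-parameter checkpoint; settings are illustrative only.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-V3"  # published open-weight checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",      # keep the dtype stored in the checkpoint
    device_map="auto",       # shard weights across available GPUs
    trust_remote_code=True,  # the repo ships custom model code
)

prompt = "Write a Python function that checks whether a string is a palindrome."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

In practice, a production deployment would more likely put the weights behind a dedicated inference server (for example vLLM or SGLang) rather than calling the model directly like this, but the point stands: the weights are downloadable and runnable on hardware you control.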