Models & Architecture · Core

Foundation model

Definition
A large, general-purpose model pre-trained on broad data that can be adapted to many downstream tasks. GPT-4, Claude, Gemini, and Llama are all foundation models. The term signals massive upfront investment and wide applicability.
Why it matters
Foundation models are the platforms of the AI era, analogous to operating systems. Just as Windows and iOS created ecosystems of applications, foundation models create ecosystems of AI products. The economics are distinctive: training a foundation model costs $100M-$1B+, but deploying it to millions of users amortizes that cost to near-zero per interaction. This creates a natural oligopoly: only a handful of organizations can afford to train frontier foundation models, but everyone can build on top of them. For business strategy, the key question is where you sit in the foundation model stack: are you a model provider, a platform on top of models, or an application using models as infrastructure?
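To make the amortization point concrete, here is a minimal back-of-the-envelope sketch. The training cost, user count, and interactions-per-user figures are illustrative assumptions, not reported numbers.

```python
# Back-of-the-envelope amortization of a one-time foundation model training cost.
# All figures are illustrative assumptions, not reported numbers.

def amortized_cost_per_interaction(training_cost_usd: float,
                                   total_interactions: float) -> float:
    """Spread a one-time training cost evenly across every interaction served."""
    return training_cost_usd / total_interactions

# Assumed inputs: a $500M training run serving 10M users,
# each averaging 10,000 interactions over the model's deployment lifetime.
training_cost = 500_000_000
users = 10_000_000
interactions_per_user = 10_000

cost = amortized_cost_per_interaction(training_cost, users * interactions_per_user)
print(f"Amortized training cost per interaction: ${cost:.4f}")  # prints $0.0050
```

Under these assumed figures, the fixed training cost works out to fractions of a cent per interaction, which is why scale of deployment, not training spend alone, drives the unit economics.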
In practice
Stanford's Center for Research on Foundation Models coined the term in 2021. As of 2026, the foundation model market is dominated by a handful of players: OpenAI (GPT), Anthropic (Claude), Google (Gemini), Meta (Llama), and Mistral. Training costs have escalated from $5M (GPT-3) to $100M+ (GPT-4) to projected $1B+ for next-generation models. The business models range from API-only (Anthropic) to open-weight (Meta) to hybrid (Google). The foundation model layer has consolidated rapidly, but the application layer built on top of these models is highly fragmented, with thousands of startups competing.
