Foundation model
- Definition
- A large, general-purpose model pre-trained on broad data that can be adapted to many downstream tasks. GPT-4, Claude, Gemini, and Llama are all foundation models. The term signals massive upfront investment and wide applicability.
- Why it matters
- Foundation models are the platforms of the AI era, analogous to operating systems. Just as Windows and iOS created ecosystems of applications, foundation models create ecosystems of AI products. The economics are distinctive: training a foundation model costs $100M-$1B+, but serving it to millions of users spreads that fixed cost across billions of interactions, driving the training cost per interaction toward zero. This creates a natural oligopoly: only a handful of organizations can afford to train frontier foundation models, but everyone can build on top of them. For business strategy, the key question is where you sit in the foundation model stack: are you a model provider, a platform on top of models, or an application using models as infrastructure?
- In practice
- Stanford's Center for Research on Foundation Models coined the term in 2021. As of 2026, the foundation model market is dominated by a handful of players: OpenAI (GPT), Anthropic (Claude), Google (Gemini), Meta (Llama), and Mistral. Training costs have escalated from an estimated $5M (GPT-3) to $100M+ (GPT-4), with next-generation models projected to exceed $1B. The business models range from API-only (Anthropic) to open-weight (Meta) to hybrid (Google). The foundation model layer has consolidated rapidly, but the application layer built on top of these models is highly fragmented, with thousands of startups competing.
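The amortization economics described above can be made concrete with a toy calculation. All figures here are hypothetical round numbers for illustration; actual training and serving costs are not public and vary widely:

```python
def training_cost_per_interaction(training_cost_usd: float,
                                  interactions: int) -> float:
    """Amortized share of a fixed training cost per user interaction."""
    return training_cost_usd / interactions

# A hypothetical $500M training run served across 10B interactions:
cost = training_cost_per_interaction(500_000_000, 10_000_000_000)
print(f"${cost:.3f} of training cost per interaction")  # $0.050
```

Note this covers only the fixed training cost; per-interaction inference (serving) costs are separate and do not amortize away.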
Related terms
Pre-training
The initial phase of model training where the network learns general knowledge from a massive dataset. Pre-training is the most expensive phase, often costing tens or hundreds of millions of dollars for frontier models.
LLM (Large Language Model)
A neural network trained on massive text corpora to predict and generate language. LLMs like GPT-4, Claude, and Gemini are the foundation of the current AI wave, powering chatbots, coding tools, and enterprise automation.
Frontier model
The most capable AI model available at any given time, representing the current state of the art. Frontier models push the boundaries of what AI can do and are typically the most expensive to train and run.
Scaling laws
Empirical relationships showing that model performance improves predictably as you increase data, compute, and parameters. Scaling laws are why labs are pouring billions into ever-larger training runs.
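The "predictable improvement" behind scaling laws is typically modeled as a power law in model size. A minimal sketch, with constants chosen purely for illustration (not measured values from any lab):

```python
def loss(params: float, n_c: float = 8.8e13, alpha: float = 0.076) -> float:
    """Toy power-law scaling: loss falls smoothly as parameters grow."""
    return (n_c / params) ** alpha

# Each 10x increase in parameters buys a predictable reduction in loss:
for n in (1e9, 1e10, 1e11):
    print(f"{n:.0e} params -> loss {loss(n):.2f}")
```

The practical takeaway is the smoothness: because the curve is a power law, labs can fit it on small training runs and extrapolate to forecast the payoff of a much larger (and much more expensive) run before committing to it.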