Models & Architecture

Scaling laws

Definition
Empirical relationships showing that model performance improves predictably as you increase data, compute, and parameters. Scaling laws are why labs are pouring billions into ever-larger training runs.
Why it matters
Scaling laws are the most important empirical finding in modern AI. They show that model quality improves as a smooth, predictable function of compute investment, with no signs of plateauing (so far). This predictability is what justifies billion-dollar training runs: if you can reliably predict that 10x more compute yields a measurably better model, the investment becomes an engineering problem rather than a research gamble. But scaling laws also have limits: they predict benchmark improvements, not real-world utility. A model that scores 5% higher on benchmarks may not be 5% more useful in production. The big strategic question is whether scaling laws continue to hold, or whether we are approaching diminishing returns.
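As a rough illustration of what "predictable" means here, the Kaplan-style laws take a power-law form. The notation below follows the general shape of those published fits; the exponent value is indicative only and varies with the exact setup:

```latex
% Schematic compute scaling law (constants vary by model family and setup).
% L = test loss, C = training compute, C_c = fitted scale constant, \alpha_C = fitted exponent.
L(C) \approx \left(\frac{C_c}{C}\right)^{\alpha_C}
% With an exponent around \alpha_C \approx 0.05, a 10x increase in compute multiplies
% the loss by roughly 10^{-0.05} \approx 0.89, i.e. about an 11% reduction.
```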
In practice
Kaplan et al.'s original scaling laws (2020) from OpenAI showed power-law relationships between compute and loss. The Chinchilla paper (Hoffmann et al., 2022) from DeepMind then established that compute-optimal training requires scaling data and parameters together, not just parameters. That finding redirected billions in industry investment: instead of training ever-larger models on fixed datasets, labs began investing equally in data curation. In practice, labs use scaling laws to predict the performance of large training runs by extrapolating from smaller ones, avoiding millions of dollars in failed experiments. The debate over whether scaling laws are plateauing or will hold for another decade is the most consequential disagreement in AI strategy.
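A minimal sketch of that extrapolation workflow, assuming hypothetical loss measurements from small pilot runs (the numbers and the single-variable power-law form below are illustrative, not any lab's actual data or methodology):

```python
import numpy as np

# Hypothetical (compute, loss) measurements from small pilot runs.
# Compute is in arbitrary units; the loss values are made up for illustration.
compute = np.array([1e2, 3e2, 1e3, 3e3, 1e4])
loss = np.array([3.10, 2.92, 2.75, 2.60, 2.45])

# A power law is linear in log-log space:
# log(loss) = log(a) - alpha * log(compute). Fit a line to the pilot runs.
slope, intercept = np.polyfit(np.log(compute), np.log(loss), 1)
alpha, a = -slope, np.exp(intercept)

# Extrapolate to a much larger (and much more expensive) training run.
target_compute = 1e6
predicted_loss = a * target_compute ** (-alpha)
print(f"fitted exponent alpha ≈ {alpha:.3f}")
print(f"predicted loss at {target_compute:.0e} compute ≈ {predicted_loss:.2f}")
```

Real scaling-law fits are more involved (separate terms for parameters and data, as in Chinchilla), but fit-on-small-runs-then-extrapolate is the core of the idea.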
