FLOPS (Floating-Point Operations Per Second)
- Definition
- A measure of computational throughput: floating-point operations per second. Training frontier models requires clusters that sustain exaFLOPS (10^18 operations per second) of throughput, and total training compute, usually quoted in FLOPs (total operations), now runs to 10^25 and beyond, making compute a proxy for the cost and scale of an AI project.
- Why it matters
- FLOPS is the currency of AI compute. Understanding it lets you estimate training costs, compare hardware, and judge whether a model's claimed capability is plausible given its training budget. When a lab says it trained a model with 10^25 FLOPs of compute, that translates to roughly $100M+ on current hardware. This is why training budgets are a proxy for organizational seriousness: you cannot train a frontier model on a startup budget. For policy, compute thresholds are already being used in regulation (the US Executive Order set 10^26 FLOPs as a reporting threshold) to flag potentially dangerous training runs.
- In practice
- GPT-4 was estimated to have used approximately 2 x 10^25 FLOPs of training compute, costing roughly $100M on A100 GPUs (see the back-of-envelope sketch below). NVIDIA's H100 delivers approximately 1,979 TFLOPS of peak FP8 throughput, roughly 3x the A100 at comparable precision, and the next-generation B200 is expected to deliver roughly another 2.5x. The US Executive Order on AI (October 2023) established a 10^26-FLOP threshold for mandatory reporting of training runs to the government, which makes compute a regulatory boundary as well as a technical metric. Companies are also investing in more efficient training techniques to get more capability per FLOP rather than simply scaling total compute.
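To make the FLOPs-to-dollars translation concrete, here is a minimal back-of-envelope sketch in Python. The 2 x 10^25 FLOPs figure and the A100 hardware come from the paragraph above; the ~40% utilization (model FLOPs utilization, MFU), the $2 per GPU-hour price, and the 20,000-GPU cluster size are illustrative assumptions, not reported figures.

```python
# Back-of-envelope training-cost estimator. Peak throughput and the 2e25-FLOP
# GPT-4 estimate come from the text above; utilization (MFU), price per
# GPU-hour, and cluster size are illustrative assumptions.

def estimate_training_run(total_flops: float,
                          peak_flops_per_gpu: float,
                          utilization: float,
                          price_per_gpu_hour: float,
                          num_gpus: int) -> dict:
    """Translate a total-FLOPs budget into GPU-hours, wall-clock time, and dollars."""
    sustained = peak_flops_per_gpu * utilization       # FLOP/s actually delivered per GPU
    gpu_hours = total_flops / sustained / 3600         # total single-GPU hours of work
    return {
        "gpu_hours": gpu_hours,
        "wall_clock_days": gpu_hours / num_gpus / 24,   # assumes perfect scaling across GPUs
        "cost_usd": gpu_hours * price_per_gpu_hour,
        "above_reporting_threshold": total_flops > 1e26,  # US EO threshold from the text
    }

# ~2e25 FLOPs on A100s (312 TFLOPS peak BF16), ~40% utilization, ~$2/GPU-hour (assumed):
# comes out around $90M and roughly three months on 20,000 GPUs.
print(estimate_training_run(2e25, 312e12, 0.40, 2.0, num_gpus=20_000))
```

Swapping in the H100 or B200 peak-throughput numbers from the paragraph above shows why each hardware generation shifts the cost of the same FLOPs budget so sharply.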
Related terms
GPU (Graphics Processing Unit)
The hardware chip that powers AI training and inference. NVIDIA's H100 and B200 GPUs are the most sought-after compute in the industry, with wait times and pricing driving major strategic decisions.
Training
The process of teaching a neural network by feeding it data and adjusting its parameters to minimize prediction errors. Training frontier models now costs $100M+ and takes months on thousands of GPUs.
Scaling laws
Empirical relationships showing that model performance improves predictably as you increase data, compute, and parameters (a minimal sketch follows this list). Scaling laws are why labs are pouring billions into ever-larger training runs.
Compute overhang
A situation where available compute capacity grows faster than the algorithmic improvements needed to use it, creating a stockpile of unused potential. A sudden algorithmic breakthrough can then unlock rapid capability jumps.
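To illustrate what "predictable improvement" means for scaling laws, here is a minimal sketch of a Chinchilla-style power law, where training loss falls smoothly as parameters N and training tokens D grow. The functional form is the standard one; the coefficients below are illustrative placeholders, not values fitted to any real model family.

```python
# Chinchilla-style scaling law sketch: loss = E + A / N**alpha + B / D**beta,
# where N is parameter count and D is training tokens. Coefficients here are
# illustrative placeholders, not fitted values.

def predicted_loss(n_params: float, n_tokens: float,
                   E: float = 1.7, A: float = 400.0, B: float = 410.0,
                   alpha: float = 0.34, beta: float = 0.28) -> float:
    """Predicted training loss under an assumed power-law fit."""
    return E + A / n_params**alpha + B / n_tokens**beta

# Larger runs land at predictably lower loss under the assumed fit
# (roughly 20 tokens per parameter, the compute-optimal ratio suggested by Chinchilla):
for n, d in [(1e9, 2e10), (7e10, 1.4e12), (1e12, 2e13)]:
    print(f"N={n:.0e}, D={d:.0e} -> loss {predicted_loss(n, d):.3f}")
```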