Token pricing

Definition
The cost model used by AI API providers, who charge separately per million input tokens and per million output tokens. Prices have fallen dramatically, from $60/M output tokens (GPT-4, 2023) to under $1/M tokens for many models in 2026.
Why it matters
Token pricing is the single most important variable in AI unit economics. If you are building an AI product, your gross margin is largely determined by the gap between what you charge users and what you pay in token costs. Understanding token pricing trends is essential for financial planning: costs have been dropping by roughly 10x per year, which means use cases that are uneconomical today may be highly profitable in 12-18 months. Conversely, a business model that only works at current pricing is risky, because competitors will be able to build the same thing at lower cost once prices drop. The pricing trend also enables new product categories: real-time voice agents, continuous code review, and always-on monitoring become viable as per-token costs approach zero.
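The margin arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not a financial model: the function names, the $20/month subscription, and the usage figures are hypothetical, the prices are placeholders, and the 10x annual decline is the rough trend quoted above applied as a smooth exponential.

```python
def gross_margin(revenue: float, input_mtok: float, output_mtok: float,
                 input_price: float, output_price: float) -> float:
    """Gross margin as a fraction of revenue, counting only token costs.

    input_mtok / output_mtok are usage in millions of tokens;
    input_price / output_price are dollars per million tokens.
    """
    token_cost = input_mtok * input_price + output_mtok * output_price
    return (revenue - token_cost) / revenue


def projected_price(price_today: float, months: float,
                    annual_decline: float = 10.0) -> float:
    """Price after `months`, assuming a steady `annual_decline`x drop per year."""
    return price_today / annual_decline ** (months / 12)


if __name__ == "__main__":
    # Hypothetical product: $20/month per user, 5M input and 1M output tokens
    # per user per month, at placeholder prices of $3 / $15 per million tokens.
    for months in (0, 12, 18):
        in_price = projected_price(3.00, months)
        out_price = projected_price(15.00, months)
        margin = gross_margin(20.00, 5, 1, in_price, out_price)
        print(f"month {months:>2}: gross margin {margin:.0%}")
```

Under these assumed numbers the product loses money on every user today and becomes strongly profitable a year later, which is the dynamic described above. Real margins also depend on non-token costs that this sketch ignores.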
In practice
GPT-4 launched at $30/M input tokens and $60/M output tokens in March 2023. GPT-4o Mini launched at $0.15/M input and $0.60/M output in July 2024. Claude 3.5 Sonnet is priced at $3/$15 per million input/output tokens. DeepSeek offers frontier-class models at $0.14/$0.28 per million tokens. Output tokens are typically 2-5x more expensive than input tokens because they are generated one at a time, each requiring its own forward pass, whereas input tokens can be processed in parallel. The pricing war is intensifying as open-source models set a floor: they are free to license, although self-hosting still incurs compute costs. Enterprise customers negotiate volume discounts of 20-50%. The trend clearly favors builders: if it holds, features that cost $100/day in API calls today will cost around $10/day within a year.
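To make the price gap concrete, here is a rough Python sketch that applies the list prices quoted above to one workload. The workload (10,000 requests per day, 2,000 input and 500 output tokens each) and the function names are hypothetical, and the calculation ignores volume discounts, caching, and other provider-specific pricing features.

```python
# (input, output) list prices in dollars per million tokens, as quoted above.
PRICES = {
    "GPT-4 (Mar 2023)": (30.00, 60.00),
    "GPT-4o Mini (Jul 2024)": (0.15, 0.60),
    "Claude 3.5 Sonnet": (3.00, 15.00),
    "DeepSeek": (0.14, 0.28),
}


def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Dollar cost of one request; prices are per million tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def daily_costs(requests_per_day: int, input_tokens: int,
                output_tokens: int) -> dict[str, float]:
    """Daily dollar cost of the same workload under each price pair."""
    return {
        model: requests_per_day * request_cost(input_tokens, output_tokens,
                                               in_price, out_price)
        for model, (in_price, out_price) in PRICES.items()
    }


if __name__ == "__main__":
    # Hypothetical workload: 10,000 requests/day, 2,000 input + 500 output tokens.
    for model, cost in daily_costs(10_000, 2_000, 500).items():
        print(f"{model:<24} ${cost:,.2f}/day")
```

For this assumed workload, GPT-4 launch pricing comes to about $900 per day and GPT-4o Mini to about $6 per day. Output tokens matter more than their volume suggests: at GPT-4o Mini prices, the 500 output tokens cost as much as the 2,000 input tokens.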
