Models & Architecture / Core

Token

Definition
The basic unit of text that AI models process, roughly equivalent to 3/4 of a word in English. Tokens determine how models read text and how providers price and limit input and output, making token efficiency a key cost lever.
Why it matters
Tokens are the atoms of the AI economy. Every interaction with an AI model is measured, priced, and limited in tokens. Understanding tokenization helps you estimate costs, optimize prompts, and debug unexpected model behavior. A 100-word email is roughly 130 tokens; a 50-page document is roughly 15,000 tokens. Token efficiency (getting more done with fewer tokens) directly impacts your AI costs. This is why concise system prompts, efficient retrieval (fetching only relevant documents), and smart caching matter. For business planning, token consumption patterns determine whether an AI feature's unit economics work at scale.
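The back-of-envelope figures above can be sketched as a simple estimator. The ~1.3 tokens-per-word ratio and the $3-per-million price below are illustrative assumptions, not quotes from any provider; actual counts depend on the tokenizer and actual prices on the model.

```python
def estimate_tokens(word_count: int, tokens_per_word: float = 1.3) -> int:
    """Rough token estimate for English text (real counts vary by tokenizer)."""
    return round(word_count * tokens_per_word)

def estimate_cost_usd(tokens: int, price_per_million_usd: float) -> float:
    """Cost of processing `tokens` at a given per-million-token price."""
    return tokens / 1_000_000 * price_per_million_usd

# A 100-word email is roughly 130 tokens.
email_tokens = estimate_tokens(100)  # 130

# At an assumed $3 per million input tokens, the email costs fractions of a cent.
email_cost = estimate_cost_usd(email_tokens, 3.0)
print(f"{email_tokens} tokens, ${email_cost:.5f}")
```

The same two functions scale up directly: a 50-page document at roughly 15,000 tokens costs about 115 times more per pass than the email, which is the kind of arithmetic behind feature-level unit economics.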
In practice
OpenAI's tiktoken, Anthropic's tokenizer, and Google's SentencePiece split text into tokens differently, so the same text can be a different number of tokens depending on the model. GPT-4o uses approximately 25% fewer tokens than GPT-4 for the same text thanks to an improved tokenizer. Multilingual text typically tokenizes less efficiently than English, meaning the same content costs more. Token pricing ranges from $0.10/M tokens (efficient models) to $60/M tokens (frontier reasoning models). Companies optimize token usage through prompt compression, caching repeated context, batching similar requests, and choosing the right model size for each task.
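The price spread above is worth making concrete. A minimal sketch, using the $0.10/M and $60/M figures from the text as stand-in tier prices (the tier names and the workload numbers are hypothetical):

```python
# Illustrative per-million-token prices from the text; real prices vary by
# provider and change frequently.
PRICES_PER_M = {"efficient": 0.10, "frontier_reasoning": 60.00}

def monthly_cost(requests: int, tokens_per_request: int, price_per_m: float) -> float:
    """Total monthly spend for a workload at a given per-million-token price."""
    return requests * tokens_per_request / 1_000_000 * price_per_m

# Hypothetical workload: 1M requests/month at 2,000 tokens each.
for tier, price in PRICES_PER_M.items():
    print(f"{tier}: ${monthly_cost(1_000_000, 2_000, price):,.2f}")
```

At this workload the two tiers differ by a factor of 600, which is why routing easy tasks to smaller models and reserving frontier models for hard ones is a standard cost lever.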
