Models & Architecture Deep Dive

Extended thinking

Definition
A model feature where the AI explicitly allocates additional inference compute to reason through complex problems step by step before producing a final answer, with the reasoning process visible to the user or developer.
Why it matters
Extended thinking represents a paradigm shift: instead of making models bigger, you make them think longer. This is significant because it decouples capability from model size, letting smaller models solve harder problems by spending more compute on each one. For product developers, extended thinking opens up use cases, such as complex analysis, multi-step planning, and mathematical proofs, that previously required human experts. The trade-off is cost and latency: extended thinking can use 10-100x more tokens than a standard response. Smart implementations let users or systems decide when to invoke extended thinking, reserving it for genuinely complex queries.
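The routing decision described above can be sketched as a simple heuristic gate. This is an illustrative toy, not a production classifier; the keyword list, length threshold, and the `route` helper are all assumptions for the example:

```python
# Naive router: invoke extended thinking only for queries that look
# genuinely complex. Keywords and thresholds here are illustrative.

COMPLEX_MARKERS = ("prove", "derive", "multi-step", "analyze", "plan", "optimize")

def needs_extended_thinking(query: str, max_simple_len: int = 200) -> bool:
    """Heuristic: flag long queries or ones containing reasoning-heavy keywords."""
    q = query.lower()
    return len(query) > max_simple_len or any(m in q for m in COMPLEX_MARKERS)

def route(query: str) -> str:
    """Pick a response mode; a real system would pass this choice to the model call."""
    return "extended_thinking" if needs_extended_thinking(query) else "standard"

print(route("What's the capital of France?"))                    # standard
print(route("Prove that the sum of two even numbers is even."))  # extended_thinking
```

In practice, teams often replace the keyword heuristic with a cheap classifier model or let the user toggle the mode explicitly, since misrouting a simple query into extended thinking wastes the 10-100x token overhead noted above.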
In practice
OpenAI's o1 model, launched in September 2024, was the first commercial reasoning model with visible chain-of-thought. Anthropic followed with Claude's extended thinking mode, which shows the model's reasoning process in a dedicated thinking block. DeepSeek-R1 demonstrated that extended thinking could be achieved with open-weight models. In practice, extended thinking models score 20-50% higher on math competitions, coding challenges, and scientific reasoning benchmarks compared to standard models. Enterprise users report that extended thinking is most valuable for complex document analysis, financial modeling, and legal research where accuracy is worth the extra latency and cost.
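As a concrete illustration of how a developer opts into this mode, here is a sketch of a request body for Claude's extended thinking. The field names follow Anthropic's Messages API as publicly documented, but treat the exact model name and parameter shapes as assumptions to verify against current docs:

```python
# Sketch of a request body enabling Claude's extended thinking mode.
# Field names follow Anthropic's Messages API documentation; the model
# name and token values below are illustrative, not prescriptive.

request = {
    "model": "claude-3-7-sonnet-latest",  # illustrative model alias
    "max_tokens": 16000,
    "thinking": {
        "type": "enabled",
        "budget_tokens": 8000,  # cap on tokens spent reasoning before answering
    },
    "messages": [
        {"role": "user", "content": "Plan a three-phase migration off a legacy schema."}
    ],
}

# The response interleaves "thinking" content blocks (the visible reasoning)
# with "text" blocks (the final answer); clients typically render or collapse
# the thinking blocks separately from the answer.
print(request["thinking"]["type"])
```

Note that the thinking budget comes out of the request's overall token spend, which is where the cost and latency trade-off discussed earlier shows up on the bill.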
