Chain-of-thought (CoT)
- Definition
- A prompting technique that instructs a model to reason step by step before giving a final answer. CoT dramatically improves accuracy on math, logic, and multi-step problems and is now trained directly into many reasoning models.
- Why it matters
- Chain-of-thought turned out to be one of the most impactful discoveries in applied AI. By simply asking a model to 'think step by step,' accuracy on math and reasoning tasks jumped 10-40 percentage points. This was not a new model or more training data; it was just a better way to prompt. CoT revealed that LLMs encode reasoning capabilities that only surface when given space to think. This insight directly led to reasoning models like o1 and DeepSeek-R1, where chain-of-thought is trained into the model itself. For practitioners, CoT is free performance: if you are not using it for complex tasks, you are leaving accuracy on the table.
- In practice
- Google's 2022 chain-of-thought paper (Wei et al.) showed that few-shot CoT prompting roughly tripled GSM8K math accuracy for PaLM 540B (from about 18% to 57%), and a follow-up study (Kojima et al., 2022) found that simply appending 'Let's think step by step' lifted zero-shot MultiArith accuracy from 17.7% to 78.7%. OpenAI's o1 model bakes chain-of-thought into the model itself, spending more compute at inference time to reason through problems. Anthropic's Claude uses extended thinking (a structured CoT mode) to tackle complex analysis. In enterprise deployments, CoT prompting is now standard practice for any task involving multi-step reasoning, calculation, or logical deduction, with some teams reporting 2-3x improvements in output quality for complex workflows.
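The zero-shot prompting pattern described above can be sketched in a few lines of plain Python. The `COT_SUFFIX` string and the `extract_final_answer` helper are illustrative names, not part of any particular model API; the model call itself is stubbed out with a canned reply:

```python
# Zero-shot chain-of-thought: append a reasoning trigger to the user's
# question, then pull the final answer out of the model's reply.

COT_SUFFIX = "\n\nLet's think step by step."

def make_cot_prompt(question: str) -> str:
    """Wrap a question with the zero-shot CoT trigger phrase."""
    return question.strip() + COT_SUFFIX

def extract_final_answer(reply: str, marker: str = "Answer:") -> str:
    """Return the text after the last 'Answer:' marker, if present."""
    if marker in reply:
        return reply.rsplit(marker, 1)[1].strip()
    return reply.strip()  # fall back to the whole reply

prompt = make_cot_prompt("A bat and a ball cost $1.10 in total. "
                         "The bat costs $1.00 more than the ball. "
                         "How much does the ball cost?")

# Canned reply standing in for a real LLM call:
reply = ("If the ball costs x, the bat costs x + 1.00, "
         "so 2x + 1.00 = 1.10 and x = 0.05. Answer: $0.05")

print(extract_final_answer(reply))  # -> $0.05
```

In a real pipeline, `prompt` would be sent to whatever LLM client you use, and the answer-extraction step keeps the visible reasoning out of downstream code.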
Related terms
Prompt engineering
The practice of crafting inputs to AI models to elicit desired outputs. Prompt engineering has become a critical skill and even a job title, though its importance may decrease as models improve at understanding intent.
Reasoning model
An AI model specifically designed to perform multi-step reasoning, typically by generating an explicit chain of thought before producing a final answer. Reasoning models trade inference speed and cost for dramatically improved performance on complex problems.
Extended thinking
A model feature where the AI explicitly allocates additional inference compute to reason through complex problems step by step before producing a final answer, with the reasoning process visible to the user or developer.
Test-time compute
The practice of allocating additional compute during inference to improve output quality, rather than relying solely on the capabilities baked in during training. Reasoning models and extended thinking are the primary examples of test-time compute scaling.
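One simple form of test-time compute scaling is self-consistency decoding: sample several chain-of-thought generations at nonzero temperature and keep the majority final answer, so more samples means more inference compute and (usually) higher accuracy. A minimal sketch, with canned answers standing in for real model samples:

```python
from collections import Counter

def majority_vote(samples: list[str]) -> str:
    """Self-consistency: aggregate sampled final answers by majority vote."""
    return Counter(samples).most_common(1)[0][0]

# In practice each entry would be the extracted final answer from one
# full CoT generation; the list below is illustrative only.
sampled_answers = ["42", "42", "41", "42", "43", "42"]
print(majority_vote(sampled_answers))  # -> 42
```

The sample count is the compute knob: doubling it doubles inference cost without touching the model's weights, which is exactly the trade the definition above describes.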