Context engineering
- Definition
- The practice of strategically designing and managing the full context fed to an AI model (system prompts, retrieved documents, conversation history, tool outputs, and structured metadata) to maximize response quality.
- Why it matters
- Context engineering is prompt engineering's more sophisticated successor. While prompt engineering focuses on crafting individual instructions, context engineering designs the entire information environment around a model call. This includes what to retrieve, what to summarize, what to include verbatim, how to order information, and when to truncate. As context windows grow to millions of tokens, the bottleneck shifts from 'can the model see this information' to 'can the model find and use the right information amid everything else.' Context engineering is now one of the highest-leverage skills in AI product development, directly determining whether your AI feature feels magical or mediocre.
- In practice
- Anthropic's Claude Code uses context engineering to manage entire codebases: it indexes the project structure, retrieves relevant files based on the current task, includes recent edit history, and structures system prompts with project-specific conventions, all before the model generates a single token. Companies like Vercel and Stripe have dedicated context engineering teams that optimize retrieval pipelines, design system prompts, and test context configurations. The common insight: a mediocre model with excellent context often outperforms a frontier model with poor context.
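The selection-and-packing step described above can be sketched in a few lines: rank candidate snippets by relevance, then greedily pack them into a fixed token budget. This is a minimal illustration, not any product's real pipeline; the `Snippet` type, the greedy strategy, and the rough 4-characters-per-token estimate are all assumptions made for the sketch.

```python
from dataclasses import dataclass

@dataclass
class Snippet:
    text: str
    relevance: float  # higher = more useful for the current task

def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token for English text.
    return max(1, len(text) // 4)

def assemble_context(snippets: list[Snippet], budget: int) -> str:
    """Greedily pack the most relevant snippets until the token budget is spent."""
    chosen, used = [], 0
    for s in sorted(snippets, key=lambda s: s.relevance, reverse=True):
        cost = estimate_tokens(s.text)
        if used + cost <= budget:
            chosen.append(s.text)
            used += cost
    return "\n\n".join(chosen)
```

Real systems replace the relevance scores with embedding similarity and the greedy packer with smarter ordering and summarization, but the core trade-off (relevance versus budget) is the same.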
Related terms
Prompt engineering
The practice of crafting inputs to AI models to elicit desired outputs. Prompt engineering has become a critical skill and even a job title, though its importance may decrease as models improve at understanding intent.
RAG (Retrieval-Augmented Generation)
A technique that retrieves relevant documents from an external knowledge base and feeds them to a model alongside the user's query. RAG reduces hallucination and keeps responses grounded in current, factual data.
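A toy version of the retrieve-then-generate loop, using word overlap as a stand-in for the embedding similarity a production system would use. Function names and the prompt wording are illustrative, not from any particular RAG library.

```python
def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Score each document by how many query words it shares, keep the top k.
    q_words = set(query.lower().split())
    ranked = sorted(docs,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    # Splice the retrieved passages in ahead of the question.
    context = "\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

Because the model is told to answer from the supplied passages, its response stays grounded in the knowledge base rather than in whatever it memorized during training.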
Context window
The maximum number of tokens a model can process in a single request, including both the prompt and the response. Larger context windows (100K-2M tokens) let models ingest entire codebases or documents at once.
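One practical consequence: because the window covers both prompt and response, prompt budgeting must reserve room for the reply. The numbers below are illustrative, not any specific model's limits.

```python
def prompt_budget(window: int, reserve_for_output: int) -> int:
    """Tokens available for the prompt once space for the response is set aside."""
    return window - reserve_for_output
```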
System prompt
A set of instructions prepended to every conversation that defines the AI model's persona, constraints, and behavior. System prompts are how companies customize foundation models for specific products and brands.
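In the chat-messages format used by most model APIs, the system prompt is simply a message with the `system` role placed ahead of the conversation. The persona text and helper name here are invented for illustration.

```python
# Hypothetical persona text; real products tune this extensively.
SYSTEM_PROMPT = "You are a concise, friendly support assistant. Cite docs when possible."

def build_messages(history: list[dict], user_input: str) -> list[dict]:
    # Prepend the same system message to every turn of the conversation.
    return ([{"role": "system", "content": SYSTEM_PROMPT}]
            + history
            + [{"role": "user", "content": user_input}])
```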