Model Wars | April 16, 2025 | via OpenAI Blog

Thinking with images

Why it matters

OpenAI's o3 and o4-mini models bring images directly into their chain of thought, extending multimodal capability from recognizing what an image contains to using the image as part of the reasoning itself. This is a material shift in how vision-language models process and understand images.

Key signals

  • OpenAI releases o3 and o4-mini with visual reasoning in chain of thought
  • Models can now reason WITH images as part of their reasoning process
  • Represents a breakthrough in multimodal perception capabilities
  • Capability: visual perception through reasoning chains

The hook

OpenAI o3 and o4-mini now reason WITH images, not just about them. Chain-of-thought visual perception is a new capability tier.

OpenAI o3 and o4-mini represent a significant breakthrough in visual perception by reasoning with images in their chain of thought.
Relevance score: 78/100
