The Drop · August 28, 2025 · via OpenAI Blog

Introducing gpt-realtime and Realtime API updates

Why it matters

OpenAI is moving beyond text with gpt-realtime, a production-ready speech-to-speech model that ships alongside new Realtime API capabilities: MCP server support, image input, and SIP phone calling. This is a significant product expansion: it enables real-time multimodal applications at scale, competes directly with voice-first AI platforms, and extends OpenAI's moat into voice and phone interfaces.

Key signals

  • gpt-realtime: advanced speech-to-speech model released
  • New Realtime API capabilities shipped
  • MCP server support added
  • Image input support enabled
  • SIP phone calling support included
  • Production-ready release (not beta/research)

The hook

OpenAI just shipped real-time speech-to-speech. Here's what that means for your product roadmap.

We’re releasing a more advanced speech-to-speech model and new API capabilities including MCP server support, image input, and SIP phone calling support.
Relevance score: 78/100
