The Drop | August 28, 2025 | via OpenAI Blog
Introducing gpt-realtime and Realtime API updates
Why it matters
OpenAI is moving beyond text with gpt-realtime, a production-ready speech-to-speech model, shipped alongside new Realtime API capabilities (MCP server support, image input, SIP phone calling). The release makes real-time multimodal applications practical at scale, puts OpenAI in direct competition with voice-first AI platforms, and extends its reach into voice and phone interfaces.
Key signals
- gpt-realtime: advanced speech-to-speech model released
- New Realtime API capabilities shipped
- MCP server support added
- Image input support enabled
- SIP phone calling support included
- Production-ready release (not beta/research)
The hook
OpenAI just shipped real-time speech-to-speech. Here's what that means for your product roadmap.
We’re releasing a more advanced speech-to-speech model and new API capabilities including MCP server support, image input, and SIP phone calling support.
Relevance score: 78/100