OpenAI launches new voice intelligence features in its API | TechCrunch
Briefly

"Together, the models we are launching move real-time audio from simple call-and-response toward voice interfaces that can actually do work: listen, reason, translate, transcribe, and take action as a conversation unfolds."
"The company's new GPT‑Realtime‑2 is another voice model, built to create a realistic vocal simulation that can converse with users. However, unlike its predecessor (GPT-Realtime-1.5) this one is built with GPT‑5‑class reasoning that OpenAI says was created to deal with more complicated requests from users."
"The company said it has built guardrails to stop its new features from being abused to create spam, fraud, or other forms of online abuse. Certain triggers have been embedded in the system so that conversations can be halted if they are detected as violating our harmful content guidelines."
OpenAI introduced new voice intelligence features for its API, led by GPT-Realtime-2, a voice model with GPT-5-class reasoning intended to handle more complicated user requests than its predecessor, GPT-Realtime-1.5. GPT-Realtime-Translate provides real-time translation across 70 input languages and 13 output languages while keeping pace with the conversation. GPT-Realtime-Whisper delivers live speech-to-text transcription as an interaction happens. Together, these tools let applications listen, reason, translate, transcribe, and take action as a conversation unfolds. Target users include customer service companies, educational institutions, media platforms, and event organizers. OpenAI says it has built guardrails against spam, fraud, and other online abuse, including triggers that halt conversations detected as violating its harmful content guidelines.
Read at TechCrunch