OpenAI, the company behind ChatGPT, has upgraded ChatGPT’s voice mode with significant improvements in intonation and naturalness, making interactions feel more fluid and human-like. The upgrade marks a leap forward in AI speech: the model now speaks with subtler intonation, a more realistic cadence (pauses and emphases), and more fitting expressiveness for certain emotions, including empathy and sarcasm.
Beyond this, the model now also offers intuitive and effective language translation. Users simply ask the voice assistant to translate a conversation, and it keeps translating until asked to stop. This works across everyday situations, whether asking for directions in Greece or chatting with a colleague from Tokyo.
Known Challenges
The update is set to become a milestone in the development of conversational AI, but some challenges remain. First, there may be occasional drops in audio quality. During the company's internal testing, some instances showed unexpected variations in tone and pitch, particularly with specific voice options. These fluctuations are generally subtle, and the company says it is actively working to improve audio consistency in future updates.
Second, rare hallucinations in Voice Mode persist. These can take the form of unintended sounds, such as segments resembling advertisements, gibberish, or background music. The company has confirmed that it is investigating these irregularities in order to understand and resolve them as soon as possible.
Availability
The upgraded Advanced Voice is now available to all paid users across supported markets and platforms.