OpenAI has demonstrated a new voice assistant that can read facial expressions and hold conversations – even resuming after being interrupted.
OpenAI hailed the announcement as “the future of interaction between ourselves and machines”. It comes amid an ‘arms race’ between leading tech firms to unveil faster and more capable AI models.
In a demonstration during a livestream event on Monday night, the model was challenged to guess what someone was doing based on what was in the room, which it was ‘shown’ via a phone camera.
“It looks like a production set-up, like you are recording a video,” the model said.
It was even able to detect emotion in someone’s voice and respond accordingly, with viewers comparing it to the AI assistant voiced by Scarlett Johansson in the film ‘Her’.
OpenAI CEO Sam Altman, who has said that Her is his favourite film, posted a single word on X (formerly Twitter) after the demo: ‘Her’.
“It feels like AI from the movies … Talking to a computer has never felt really natural for me; now it does,” he later wrote in a blog post.
What does GPT-4o do?
The model is called GPT-4o, and its new capabilities enable users to speak to ChatGPT and receive responses in real time, as well as interrupt ChatGPT while it is speaking – both hallmarks of natural conversation that AI voice assistants have struggled to achieve.
OpenAI faces growing competition and pressure to expand the user base of ChatGPT, its popular chatbot product that wowed the world with its ability to produce human-like written content and software code.
In one demo, ChatGPT used its vision and voice capabilities to talk a researcher through solving a maths equation on a sheet of paper; in another, it translated speech between languages in real time.
In a third, the OpenAI researcher told the chatbot he was in a great mood because he was demonstrating “how useful and amazing you are”.
ChatGPT responded: “Oh stop it! You’re making me blush!”
Altman added in his blog post that GPT-4o was “the best computer interface I’ve ever used”.
“It’s still a bit surprising to me that it’s real,” he said. “Getting to human-level response times and expressiveness turns out to be a big change.”
When will it be available?
The GPT-4o model will be available in ChatGPT over the next few weeks, the company said.
OpenAI’s chief technology officer, Mira Murati, said at the event that the new model would be offered for free because it is more cost-effective than the company’s previous models.
Paid users will have higher usage limits on GPT-4o than the company’s free users, she said.
In addition, free ChatGPT users now have access to a “browse” feature that enables ChatGPT to display up-to-date information from the web.
Source: OpenAI unveils talking GPT-4o ‘assistant’ – here’s what you need to know