OpenAI wants all your apps to talk in its expressive AI voices

SAN FRANCISCO - ChatGPT maker OpenAI will allow any app developer to add snappy and humanlike voice interaction to their products, a move that could greatly increase the number of people interacting with a slick and sometimes provocative new generation of artificial intelligence technology.

OpenAI’s “advanced voice mode,” which offers six AI voices that sound casual and expressive, with the ability to detect and react to different human vocal tones, has been available to ChatGPT subscribers since July. The technology will now be offered to the thousands of companies that pay to use OpenAI technology in their own products - and any new developer who signs up.

Opening more of its AI inventions to outsiders could help grow OpenAI’s revenue from usage fees it charges each time an app taps its technology. Increasing that income is crucial to the company, which is seeking billions of dollars in new funding and considering restructuring to remove its business from the control of its existing nonprofit board.

OpenAI announced it was opening access to its voice technology at an event in San Francisco for software developers. At a news briefing ahead of the event, OpenAI executives showed how an app built on its voice technology could make a phone call to a business and place an order for chocolate strawberries.

The business was not real, and the person who took down the order and asked questions, to which the AI voice nimbly responded, was an OpenAI executive role-playing. But app developers will be able to deploy that capability immediately, the executives said.

“We want to make it possible to interact with AI in all of the ways you interact with a human being,” OpenAI chief product officer Kevin Weil said at the press briefing.

Fully achieving that goal would be challenging with existing AI technology, which often makes mistakes. But if app developers rush to build on OpenAI’s voice algorithms, many more people could soon be using or exposed to automated systems that capably mimic people in some scenarios.

Realistic voice synthesis technology has already been used in some call centers, but if OpenAI succeeds in making it more sophisticated and widespread, consumers could encounter fresh frustrations or scams as well as new conveniences.

The Federal Communications Commission last week fined a political consultant $6 million for using AI to fake the voice of President Joe Biden in New Hampshire robocalls earlier this year.

In 2018, Google debuted its AI voice bot, Google Duplex, which could call restaurants or salons to make reservations. But the humans employed by local businesses receiving the calls didn’t always like it.

Initially, Duplex did not disclose that it was a piece of software. A backlash prompted Google to reprogram it to always introduce itself as an AI system. The feature hasn’t caught on broadly, although Google Pixel phones can screen calls using similar technology and the company’s cloud unit offers voice technology for call centers.

Voice bots have drastically improved over the past several years. When OpenAI’s new voice mode for ChatGPT was first unveiled in May, it won plaudits for its ability to carry out snappy conversations, understand complex topics and questions, sing, and respond to different tones of voice.

OpenAI’s system has been tuned to sound eager and polite, similar to the tone of ChatGPT’s textual answers. The company offers six voices to choose from but retired its initial default, called “Sky,” after Scarlett Johansson alleged the company copied her voice without permission. Documents shared with The Washington Post indicate a different actor was hired for recordings used to make the voice.

OpenAI says 3 million developers in dozens of countries are now experimenting with its technology and using it to build new apps and features. Opening up its latest voice features could help grow that crucial user base and the revenue it provides.

Some tech investors and entrepreneurs say that just as the internet incubated giants like Google and smartphones enabled businesses like Uber and DoorDash, artificial intelligence will spawn a wave of start-ups that become integral to people’s lives.

Other AI experts caution that new capabilities like convincingly expressive voices can cause havoc in the wrong hands. An AI narration service marketed for audiobooks and video games was adapted to make the fake Biden robocalls placed in New Hampshire in January.

OpenAI says its rules ban developers from using its services to spam, mislead or harm people, and that it has built systems that monitor how people use its technology so it can shut down any who don’t comply.

Developers tapping the new voice offering must “make it clear to their users that they are interacting with AI, unless it’s obvious from the context,” the company said in a blog post about the new tools.

OpenAI’s systems have been abused in the past, despite similar rules being in place for ChatGPT and its other services, including a ban on political campaigning.

Earlier this year, one developer used the company’s technology to build a chatbot for Democratic presidential hopeful Rep. Dean Phillips (Minn.). The company banned the creator of the rule-breaking chatbot only after The Post reported on the project.
