Skip to main content

Digital Trends may earn a commission when you buy through links on our site. Why trust us?

This AI can spoof your voice after just three seconds

Artificial intelligence (AI) is having a moment right now, and the wind continues to blow in its sails with the news that Microsoft is working on an AI that can imitate anyone’s voice after being fed a short three-second sample.

The new tool, dubbed VALL-E, has been trained on roughly 60,000 hours of voice data in the English language, which Microsoft says is “hundreds of times larger than existing systems”. Using that knowledge, its creators claim it only needs a small smattering of vocal input to understand how to replicate a user’s voice.

man speaking into phone
Fizkes/Shutterstock

More impressive, VALL-E can reproduce the emotions, vocal tones, and acoustic environment found in each sample, something other voice AI programs have struggled with. That gives it a more realistic aura and brings its results closer to something that could pass as genuine human speech.

When compared to other text-to-speech (TTS) competitors, Microsoft says VALL-E “significantly outperforms the state-of-the-art zero-shot TTS system in terms of speech naturalness and speaker similarity.” In other words, VALL-E sounds much more like real humans than rival AIs that encounter audio inputs that they have not been trained on.

On GitHub, Microsoft has created a small library of samples created using VALL-E. The results are mostly very impressive, with many samples that reproduce the lilt and accent of the speakers’ voices. Some of the examples are less convincing, indicating VALL-E is probably not a finished product, but overall the output is convincing.

Huge potential — and risks

A person conducting a video call on a Microsoft Surface device running Windows 11.
Microsoft/Unsplash

In a paper introducing VALL-E, Microsoft explains that VALL-E “may carry potential risks in misuse of the model, such as spoofing voice identification or impersonating a specific speaker.” Such a capable tool for generating realistic-sounding speech raises the specter of ever-more convincing deepfakes, which could be used to mimic anything from a former romantic partner to a prominent international personality.

To mitigate that threat, Microsoft says “it is possible to build a detection model to discriminate whether an audio clip was synthesized by VALL-E.” The company says it will also use its own AI principles when developing its work. Those principles cover areas such as fairness, safety, privacy, and accountability.

VALL-E is just the latest example of Microsoft’s experimentation with AI. Recently, the company has been working on integrating ChatGPT into Bing, using AI to recap your Teams meetings, and grafting advanced tools into apps like Outlook, Word, and PowerPoint. And according to Semafor, Microsoft is looking to invest $10 billion into ChatGPT maker OpenAI, a company it has already plowed significant funds into.

Despite the apparent risks, tools like VALL-E could be especially useful in medicine, for instance, to help people to regain their voice after an accident. Being able to replicate speech with such a small input set could be immensely promising in these situations, provided it is done right. But with all the money being spent on AI — both by Microsoft and others — it’s clear it’s not going away any time soon.

Editors' Recommendations

Alex Blake
In ancient times, people like Alex would have been shunned for their nerdy ways and strange opinions on cheese. Today, he…
Copilot: how to use Microsoft’s own version of ChatGPT
Microsoft's AI Copilot being used in various Microsoft Office apps.

ChatGPT isn’t the only AI chatbot in town. One direct competitor is Microsoft’s Copilot (formerly Bing Chat), and if you’ve never used it before, you should definitely give it a try. As part of a greater suite of Microsoft tools, Copilot can be integrated into your smartphone, tablet, and desktop experience, thanks to a Copilot sidebar in Microsoft Edge. 

Like any good AI chatbot, Copilot’s abilities are constantly evolving, so you can always expect something new from this generative learning professional. Today though, we’re giving a crash course on where to find Copilot, how to download it, and how you can use the amazing bot. 
How to get Microsoft Copilot
Microsoft Copilot comes to Bing and Edge. Microsoft

Read more
Reddit seals $60M deal with Google to boost AI tools, report claims
The Reddit logo.

Google has struck a deal worth $60 million that will allow it to use Reddit content to train its generative-AI models, Reuters reported on Thursday, citing three people familiar with the matter.

The claim follows a Bloomberg report earlier in the week that said Reddit had inked such a deal, though at the time, the name of the other party remained unclear.

Read more
Google brings AI to every text field on the internet
AI features see in a graphic for Google Chrome.

Tired of hearing about AI? Well, get ready. Google is now adding generative AI built right into its Chrome web browser.

In a new announcement, the company revealed that Chrome is set to receive three new additions that will leverage artificial intelligence to simplify tab organization, enable personalized theming, and, most significantly, even assist users in drafting content on the web anywhere an empty text field exists.
AI-powered writing assistance

Read more