Meta Voicebox Uses Generative AI to Replicate Voices from Scratch in Six Languages

Cornell’s EchoSpeech uses AI and sonar to read silent speech, while Meta’s Voicebox is a generative AI model capable of replicating voices from scratch in six languages. Voicebox could give natural-sounding voices to virtual assistants or non-player characters in the metaverse, and let visually impaired people hear written messages from friends read aloud by AI in those friends’ own voices.

Voicebox can recreate a section of speech interrupted by noise, replace misspoken words, and replicate a passage in a completely new language, whether English, French, German, Spanish, Polish or Portuguese. This is possible because Voicebox was trained on more than 50,000 hours of recorded speech and transcripts from public domain audiobooks in all six languages, enabling it to predict a speech segment when given the surrounding speech as well as the transcript of that section. Check out the GitHub page here.
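To make the "predict a segment from its surroundings" idea concrete, here is a minimal Python sketch of how one such training example might be framed. It is not Meta's code: the `InfillExample` class, the `make_infill_example` function, and the toy frame values are all hypothetical, and a real system would work on audio features rather than a short list of numbers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class InfillExample:
    """One hypothetical infilling example (illustrative only)."""
    context: List[Optional[float]]  # audio frames, with the hidden span set to None
    target: List[float]             # the frames a model would be asked to predict
    transcript: str                 # text for the whole clip, hidden span included


def make_infill_example(frames: List[float], transcript: str,
                        start: int, end: int) -> InfillExample:
    """Hide frames[start:end] and keep the surrounding speech as context."""
    if not 0 <= start < end <= len(frames):
        raise ValueError("span must lie inside the recording")
    context = [None if start <= i < end else f for i, f in enumerate(frames)]
    return InfillExample(context=context,
                         target=list(frames[start:end]),
                         transcript=transcript)


# Toy "recording" of six frames; frames 2 and 3 are hidden from the model.
frames = [0.1, 0.4, -0.2, 0.3, 0.0, -0.5]
example = make_infill_example(frames, "hello world", 2, 4)
print(example.context)
print(example.target)
```

The same framing covers the uses described above: a span drowned out by noise or a misspoken word is simply the hidden segment, and the transcript says what should be spoken there.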

“Voicebox represents an important step forward in generative AI research. Other scalable generative AI models with task generalization capabilities have sparked excitement about potential applications across tasks when it comes to text, image, and video generation,” said Meta.
