In an open letter earlier this year, Neal Mohan, the recently appointed head of YouTube, made a pledge to creators that better translation tools were coming. Now, YouTube is delivering on that promise with Aloud — a free tool that automatically dubs videos using synthetic voices, raising creators’ hopes and putting new pressure on dubbing firms that already cater to YouTubers.
At the VidCon convention in late June, YouTube announced a pilot for Aloud. The tool first generates a transcription of a video’s audio, which a creator can edit before selecting their preferred language and style of synthetic voice. The dub can take just minutes to generate. The pilot currently includes the option to dub videos into English, Spanish, and Portuguese. The company has said more languages are coming — likely including Bahasa Indonesia and Hindi, which are already advertised on the Aloud website. Hundreds of creators have already signed up to test the tool.
“Our long-term goal is to be able to dub between any two languages, and as part of that goal we will continue to pilot and learn from dubbing content in different regions,” Buddhika Kottahachchi, co-founder of Aloud and the recently appointed head of product for YouTube Dubbing, told Rest of World. “Helping a creator expand beyond their primary language can help them reach new audiences.”
Dubbing firms, known as language service providers (LSPs), have been hired by some of YouTube’s most popular creators, including MrBeast, PewDiePie, and Dude Perfect, to bring their content to millions more viewers. These firms already regularly dub videos into Spanish, Russian, Japanese, and other languages. Many smaller creators, however, are priced out of these services. By offering Aloud for free, YouTube is giving a new swath of creators access to dubs for the first time.
In the lead-up to the pilot announcement, YouTube also released a new product feature that allows viewers to select between multiple dubbing tracks on a single video, similar to the current option for subtitles. Though the on-screen graphics and thumbnails will still be set to only one language, creators no longer have to post entirely separate videos for each dub.
Until recently, Aloud was a part of Google’s in-house incubator, called Area 120. As of last month, it has been officially folded into the company. For now, Aloud’s pilot is mostly made up of creators already part of the YouTube Partner Program, which allows channels to monetize their uploads. Educational channels are among the earliest selected, including the biology channel Amoeba Sisters and Kings and Generals, which caters to history buffs.
The Aloud team has also set its sights on regional languages. In late 2022, it ran a select pilot in India that allowed health-care content creators to dub their videos into Hindi, Malayalam, Tamil, Telugu, and Bengali. “Underserved markets such as Indic languages [remain] a top priority for us,” Sasakthi Abeysinghe, Aloud’s co-founder and now principal engineer and lead of YouTube Dubbing, told Rest of World.
YouTube has had mixed results with its translation product rollouts, with the platform’s auto-captioning feature particularly criticized for poor quality. A community subtitling program, which allowed users to jointly edit or correct captions, was launched in response to advocacy groups for the deaf and hard of hearing. In the fall of 2020, the program was scrapped because of poor traction with users and complaints that the tool was being abused.
Still, YouTube’s new push into automated dubbing is a serious challenge for existing dubbing companies, which are now forced to compete with a free tool built into the platform.
“The uncontrolled advance of AIs could mean the loss of countless jobs in the sector … not only us as [voice actors], but also the whole chain of colleagues in this industry: scriptwriters, translators, adapters, and studios with all their workers,” said Vicky Tessio, board member of the Spanish voice-over artists’ guild, La General de Locutores, who has produced countless YouTube dubs for clients.
Tessio points out that it’s not only small creators who could replace dubbing firms with Aloud, but corporate channels as well. “The issue is not just about ‘content creators,’ but about hundreds of thousands of companies that post their videos on YouTube,” she said.
For some old-guard companies, the quality of human translations and voice-overs is still a crucial protection against brand-damaging dubbing errors. “There is no room for mistakes. Big or small creators. Human[s] in the loop is a must,” Farbod Mansorian, founder of Unilingo, a firm that has dubbed channels for MrBeast, WatchMojo, and Mark Rober, told Rest of World. “Brands do not want to advertise on content with bad translation as it would hurt their reputation … What we know is that we are still a few years away from a reality where top creators are foregoing using an immersive, natural human voice, as opposed to a synthetic voice to save costs.”
Kottahachchi, Aloud’s co-founder, noted that creators would have full control, through the YouTube Studio portal, over whether their videos are dubbed. “We fully acknowledge that our captions are still a work in progress, which is why it is an area that we continue to invest in and improve,” he said.
Other startups like Papercup, Tovid.ai, and Dubverse.ai are also offering their own AI dubbing tools, hoping their special features will be enough to justify paying a premium over YouTube’s in-house model. Dubverse.ai’s co-founder, Varshul Gupta, calls AI dubbing a “cheat code for creators.”
“Right now, if a creator has to cater to a newer audience, the per-dollar value that they invest in creating a new form of content is exceedingly high,” Gupta told Rest of World, encouraging creators to dub their back catalog of videos. “If they just dub their existing content, the per-dollar value of each becomes exceedingly efficient.”
Dubverse.ai’s synthetic voice product went live less than six months ago, and more than 500 YouTubers, 50 of whom have over a million subscribers, already use it to dub their videos, according to Gupta. The platform works on a subscription model, charging $30 for 10 minutes of dubbed content. With 400,000 users, Gupta said, the company has already generated about 25,000 hours of dubbed content.
As Aloud continues its rollout, AI dubbing firms are likely to set themselves apart by focusing on quality over scale, or by specializing in certain content genres. Ian Shepherd, co-founder of Electrify Video Partners, told Rest of World that many synthetic voices can sound robotic, with little intonation, which puts off viewers. As a result, independent dubbing firms are well positioned to introduce more emotional nuance in their voices and better capture regional dialects.
“AI dubbing firms can compete with YouTube’s Aloud,” said Shepherd. “We are in the early iterations of the technology, and it’s far from perfect.”