Automated dubbing is now picking specific words to match the literal mouth shapes of the original speaker.
April 14, 2026
Original Paper
PS-TTS: Phonetic Synchronization in Text-to-Speech for Achieving Natural Automated Dubbing
arXiv · 2604.09111
The Takeaway
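By paraphrasing the translated text and applying phonetic synchronization, this system aims for 'believable lip-sync': dubbed speech that fits the speaker's visible mouth movements, not just the duration of the original audio. It treats dubbing as a joint optimization over wording, timing, and lip movement.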
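To make the idea concrete, here is a minimal sketch of how choosing among paraphrases could work. It is an illustration, not the paper's method: the `Candidate` type, the vowel-sequence comparison, the edit-distance cost, and the weights are all assumptions introduced here, and the durations and vowel sequences are made-up inputs that a real system would get from a TTS model and a phonemizer.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Candidate:
    """A hypothetical paraphrase of the translated line."""
    text: str
    duration_s: float        # assumed: estimated duration of the synthesized speech
    vowels: Sequence[str]    # assumed: vowel sequence as a rough proxy for mouth shapes


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two sequences (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def pick_candidate(source_duration_s: float,
                   source_vowels: Sequence[str],
                   candidates: Sequence[Candidate],
                   duration_weight: float = 1.0,
                   vowel_weight: float = 1.0) -> Candidate:
    """Return the candidate with the lowest combined timing and vowel mismatch."""
    def cost(c: Candidate) -> float:
        # Relative duration mismatch against the source speech.
        duration_cost = abs(c.duration_s - source_duration_s) / source_duration_s
        # Vowel mismatch, normalized by the longer of the two sequences.
        longest = max(len(source_vowels), len(c.vowels), 1)
        vowel_cost = edit_distance(source_vowels, c.vowels) / longest
        return duration_weight * duration_cost + vowel_weight * vowel_cost

    return min(candidates, key=cost)


# Made-up example: the source line lasts 2.0 s with vowels a, o, i.
candidates = [
    Candidate("candidate A", 2.0, ["e", "u", "e"]),  # perfect timing, wrong vowels
    Candidate("candidate B", 2.2, ["a", "o", "i"]),  # slightly long, matching vowels
    Candidate("candidate C", 3.0, ["a", "o", "i"]),  # matching vowels, far too long
]
best = pick_candidate(2.0, ["a", "o", "i"], candidates)
print(best.text)  # prints "candidate B"
```

With equal weights, the candidate that is slightly too long but matches the vowel sequence wins over the one with perfect timing and mismatched vowels, which is the trade-off the takeaway describes.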
From the abstract
Recently, artificial intelligence-based dubbing technology has advanced, enabling automated dubbing (AD) to convert the source speech of a video into target speech in different languages. However, natural AD still faces synchronization challenges such as duration and lip-synchronization (lip-sync), which are crucial for preserving the viewer experience. Therefore, this paper proposes a synchronization method for AD processes that paraphrases translated text, comprising two steps: isochrony for t