Papers
arxiv:2205.06421

Talking Face Generation with Multilingual TTS

Published on May 13, 2022
Authors:
,
,
,
,
,
,

Abstract

In this work, we propose a joint system combining a talking face generation system with a text-to-speech system that can generate multilingual talking face videos from only the text input. Our system can synthesize natural multilingual speeches while maintaining the vocal identity of the speaker, as well as lip movements synchronized to the synthesized speech. We demonstrate the generalization capabilities of our system by selecting four languages (Korean, English, Japanese, and Chinese) each from a different language family. We also compare the outputs of our talking face generation model to outputs of a prior work that claims multilingual support. For our demo, we add a translation API to the preprocessing stage and present it in the form of a neural dubber so that users can utilize the multilingual property of our system more easily.

Community

Sign up or log in to comment

Models citing this paper 1

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2205.06421 in a dataset README.md to link it from this page.

Spaces citing this paper 4

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.