
Listen, Disentangle, and Control: Controllable Speech-Driven Talking Head Generation

Authors :
Cai, Changpeng
Guo, Guinan
Li, Jiao
Su, Junhao
He, Chenghao
Xiao, Jing
Chen, Yuanxu
Dai, Lei
Zhu, Feiyu
Publication Year :
2024

Abstract

Most earlier investigations into talking face generation have focused on synchronizing lip motion with speech content. However, head pose and facial emotion are equally important characteristics of natural human faces. While audio-driven talking face generation has seen notable advancements, existing methods either overlook facial emotions or are limited to specific individuals and cannot be applied to arbitrary subjects. In this paper, we propose a one-shot Talking Head Generation framework (SPEAK) that distinguishes itself from general Talking Face Generation by enabling emotional and postural control. Specifically, we introduce the Inter-Reconstructed Feature Disentanglement (IRFD) method to decouple human facial features into three latent spaces. We then design a face editing module that maps speech content and facial latent codes into a single latent space. Subsequently, we present a novel generator that employs the modified latent codes derived from the editing module to regulate emotional expression, head pose, and speech content when synthesizing facial animations. Extensive trials demonstrate that our method can generate realistic talking heads with coordinated lip motions, authentic facial emotions, and smooth head movements. The demo video is available at the anonymous link: https://anonymous.4open.science/r/SPEAK-F56E

Comment: Due to our negligence, there are factual errors in the experimental results, so we are considering resubmitting the paper after an overhaul.
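To make the described pipeline concrete, the sketch below shows one plausible way the three-way disentanglement, editing module, and generator could fit together. It is a minimal illustration, assuming separate identity/emotion/pose encoders, an MLP-based fusion step, and a small convolutional decoder; all module names, dimensions, and internals are hypothetical and are not the authors' released implementation (PyTorch required).

```python
# Hypothetical sketch of the SPEAK-style pipeline described in the abstract:
# three encoders disentangle identity, emotion, and pose; an editing module
# fuses them with a speech-content feature; a generator renders a frame.
import torch
import torch.nn as nn


class LatentEncoder(nn.Module):
    """Maps a face image to one latent code (identity, emotion, or pose)."""
    def __init__(self, latent_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, latent_dim),
        )

    def forward(self, img: torch.Tensor) -> torch.Tensor:
        return self.net(img)


class EditingModule(nn.Module):
    """Fuses speech-content features with facial latent codes into one code."""
    def __init__(self, latent_dim: int = 128, audio_dim: int = 80):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Linear(3 * latent_dim + audio_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )

    def forward(self, z_id, z_emo, z_pose, audio_feat):
        return self.fuse(torch.cat([z_id, z_emo, z_pose, audio_feat], dim=-1))


class Generator(nn.Module):
    """Decodes the fused latent code into a (toy-resolution) face frame."""
    def __init__(self, latent_dim: int = 128):
        super().__init__()
        self.fc = nn.Linear(latent_dim, 64 * 8 * 8)
        self.up = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, z):
        x = self.fc(z).view(-1, 64, 8, 8)
        return self.up(x)  # (B, 3, 32, 32)


if __name__ == "__main__":
    enc_id, enc_emo, enc_pose = LatentEncoder(), LatentEncoder(), LatentEncoder()
    editor, gen = EditingModule(), Generator()
    src = torch.randn(1, 3, 64, 64)       # one-shot identity/source face
    emo_ref = torch.randn(1, 3, 64, 64)   # emotion reference frame
    pose_ref = torch.randn(1, 3, 64, 64)  # pose reference frame
    audio = torch.randn(1, 80)            # per-frame speech-content feature
    z = editor(enc_id(src), enc_emo(emo_ref), enc_pose(pose_ref), audio)
    frame = gen(z)
    print(frame.shape)  # torch.Size([1, 3, 32, 32])
```

In this reading, swapping the emotion or pose reference while holding the identity image and audio fixed changes only the corresponding factor of the output, which is the controllability the abstract claims; the real system presumably operates per frame over an audio sequence at much higher resolution.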

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2405.07257
Document Type :
Working Paper