Sound to expression: Using emotional sound to guide facial expression editing.

Authors :
Liu, Wenjin
Zhang, Shudong
Zhou, Lijuan
Luo, Ning
Chen, Qian
Source :
Journal of King Saud University - Computer & Information Sciences; Mar2024, Vol. 36 Issue 3, pN.PAG-N.PAG, 1p
Publication Year :
2024

Abstract

Recently, image generation technology has produced striking results. However, precisely recognizing the emotion in a sound and accurately expressing it on the face of a designated person remains a major challenge. To address this challenge, a new framework, Sound to Expression (S2E), is proposed, which uses the emotion in sound to guide facial expression image generation. A speech dataset for emotion recognition is also constructed. S2E can edit the facial expressions of different people according to the different emotions in sounds. S2E consists of a Continuous Wavelet Transform (CWT), YOLOv3, ChatGPT-3, and a facial expression diffusion editing model (FEDEM). CWT is used to extract emotional features from different sounds. YOLOv3 is employed to identify the emotion categories. The emotion category and a specific person's name are input into ChatGPT-3 to randomly generate a description of the person and emotion. The description is input into FEDEM to generate a facial expression image. To generate more accurate images and address emotional semantic deviation, a new facial detail emotional preservation loss is proposed. Experimental results show that S2E can accurately recognize the emotion in a voice and use it to guide the editing of the specified person's facial expression, generating more accurate images. [ABSTRACT FROM AUTHOR]
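
The CWT stage named in the abstract can be illustrated with a minimal sketch. The abstract does not state which wavelet, scale range, or preprocessing the authors use, so everything below is an assumption made for illustration: the complex Morlet wavelet, the parameter w0 = 6, the truncation of the wavelet at four scale units, and the function names morlet and cwt_scalogram are all hypothetical. The sketch computes the magnitude of the transform (a scalogram), the kind of time-scale image that an image-based detector could plausibly take as input.

```python
import cmath
import math


def morlet(t, w0=6.0):
    """Complex Morlet wavelet at time t (small admissibility correction omitted)."""
    return math.pi ** -0.25 * cmath.exp(1j * w0 * t) * math.exp(-0.5 * t * t)


def cwt_scalogram(signal, scales, dt=1.0, w0=6.0):
    """Return |CWT| as a list of rows, one row per scale, one column per sample.

    Implements W(s, b) = (1 / sqrt(s)) * sum_t x(t) * conj(psi((t - b) / s)) * dt
    by direct summation. Assumed choices: Morlet wavelet, support cut at +/- 4 scales.
    """
    n = len(signal)
    rows = []
    for s in scales:
        half = int(math.ceil(4.0 * s / dt))  # half-width of the truncated wavelet, in samples
        offsets = range(-half, half + 1)
        # Conjugated, scaled wavelet samples for this scale
        kernel = [morlet(k * dt / s, w0).conjugate() * dt / math.sqrt(s) for k in offsets]
        row = []
        for b in range(n):
            acc = 0j
            for weight, k in zip(kernel, offsets):
                i = b + k
                if 0 <= i < n:  # samples outside the signal are treated as zero
                    acc += signal[i] * weight
            row.append(abs(acc))
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Synthetic stand-in for an audio frame: a pure tone at 0.05 cycles per sample.
    tone = [math.sin(2 * math.pi * 0.05 * i) for i in range(256)]
    scales = [5.0, 19.0, 40.0]
    scalogram = cwt_scalogram(tone, scales)
    centre = len(tone) // 2
    strongest = max(range(len(scales)), key=lambda r: scalogram[r][centre])
    print("strongest scale at centre:", scales[strongest])
```

For a Morlet wavelet with w0 = 6, a scale s responds most strongly to frequencies near w0 / (2πs) cycles per sample, which is why the test tone is paired with a scale near 19. A practical system would more likely use an FFT-based implementation from a signal-processing library than this direct summation.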

Details

Language :
English
ISSN :
13191578
Volume :
36
Issue :
3
Database :
Supplemental Index
Journal :
Journal of King Saud University - Computer & Information Sciences
Publication Type :
Academic Journal
Accession number :
176719482
Full Text :
https://doi.org/10.1016/j.jksuci.2024.101998