
Recognizing Social Signals with Weakly Supervised Multitask Learning for Multimodal Dialogue Systems

Authors:
Shogo Okada
Yuki Hirano
Kazunori Komatani
Source:
ICMI
Publication Year:
2021
Publisher:
ACM, 2021.

Abstract

Social signal processing is a methodology for inferring human inner states, including attitudes, sentiments, and impressions, from verbal and nonverbal multimodal information. The difficulty in training a social signal recognition model is that the ground-truth (target) labels given by multiple coders often disagree, because annotating social signals such as sentiments is a subjective and ambiguous task. We introduce weakly supervised learning (WSL) algorithms for this inaccurate-supervision setting, in which the target labels are not necessarily accurate. The novel challenge in this paper is to explore an effective WSL strategy for recognizing social signals. The strategy is verified on two multimodal datasets comprising audio, visual, and linguistic data collected in a human-agent dialogue setting. First, we show that the proposed WSL strategy for deep neural networks (DNNs), called tri-teaching, works well in almost all classification tasks. Second, we demonstrate the effectiveness of integrating WSL with multitask learning (MTL), which exploits the several label types available in the datasets. Third, we show that our proposed approach suffers less accuracy degradation than an existing DNN training algorithm (curriculum learning) in a cross-corpus setting, with a maximum improvement of 7.2%.
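The record does not spell out the tri-teaching algorithm itself, but it belongs to the family of small-loss sample-selection methods (e.g., co-teaching) for learning with inaccurate labels. The sketch below illustrates that general idea under assumed details: three peer models are trained on noisily labeled toy data, and for each model, its two peers jointly select the small-loss (likely clean) samples used to update it. The models are simple logistic regressors rather than the paper's DNNs, and the data, noise rate, and keep ratio are all hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary classification data with deliberately corrupted labels,
# mimicking the "inaccurate supervision" setting of disagreeing coders.
n, d = 300, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y_clean = (X @ w_true > 0).astype(float)
flip = rng.random(n) < 0.2                     # 20% label noise (assumed)
y_noisy = np.where(flip, 1.0 - y_clean, y_clean)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sample_losses(w, X, y):
    # Per-sample cross-entropy loss of a logistic model.
    p = sigmoid(X @ w)
    eps = 1e-9
    return -(y * np.log(p + eps) + (1.0 - y) * np.log(1.0 - p + eps))

def gradient(w, X, y):
    return X.T @ (sigmoid(X @ w) - y) / len(y)

# Three peer models standing in for the paper's DNNs.
W = [rng.normal(scale=0.01, size=d) for _ in range(3)]
keep_ratio = 0.8                               # fraction of small-loss samples kept
lr = 0.5

for epoch in range(200):
    new_W = []
    for k in range(3):
        # Model k's two peers jointly rank samples by loss; only the
        # small-loss subset (presumed clean) is used to update model k.
        peers = [i for i in range(3) if i != k]
        peer_loss = sum(sample_losses(W[i], X, y_noisy) for i in peers)
        idx = np.argsort(peer_loss)[: int(keep_ratio * n)]
        new_W.append(W[k] - lr * gradient(W[k], X[idx], y_noisy[idx]))
    W = new_W

# Evaluate one model against the clean labels: filtering out large-loss
# samples lets it recover the underlying decision boundary.
preds = (sigmoid(X @ W[0]) > 0.5).astype(float)
acc = (preds == y_clean).mean()
```

The key design point is that each model is updated only on samples its *peers* consider easy, which limits self-confirmation of label noise; the actual tri-teaching schedule, architectures, and selection rule in the paper may differ.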

Details

Database:
OpenAIRE
Journal:
Proceedings of the 2021 International Conference on Multimodal Interaction
Accession number:
edsair.doi...........45d47705db0d63a40e24ae924f37496f