Quantifying & Modeling Multimodal Interactions: An Information Decomposition Framework

Authors :
Liang, Paul Pu
Cheng, Yun
Fan, Xiang
Ling, Chun Kai
Nie, Suzanne
Chen, Richard
Deng, Zihao
Allen, Nicholas
Auerbach, Randy
Mahmood, Faisal
Salakhutdinov, Ruslan
Morency, Louis-Philippe
Publication Year :
2023

Abstract

The recent explosion of interest in multimodal applications has resulted in a wide selection of datasets and methods for representing and integrating information from different modalities. Despite these empirical advances, there remain fundamental research questions: How can we quantify the interactions that are necessary to solve a multimodal task? Subsequently, what are the most suitable multimodal models to capture these interactions? To answer these questions, we propose an information-theoretic approach to quantify the degree of redundancy, uniqueness, and synergy relating input modalities with an output task. We term these three measures the PID statistics of a multimodal distribution (or PID for short), and introduce two new estimators for these PID statistics that scale to high-dimensional distributions. To validate PID estimation, we conduct extensive experiments on both synthetic datasets where the PID is known and on large-scale multimodal benchmarks where PID estimates are compared with human annotations. Finally, we demonstrate their usefulness in (1) quantifying interactions within multimodal datasets, (2) quantifying interactions captured by multimodal models, (3) principled approaches for model selection, and (4) three real-world case studies engaging with domain experts in pathology, mood prediction, and robotic perception where our framework helps to recommend strong multimodal models for each application.

Comment: NeurIPS 2023. Code available at: https://github.com/pliang279/PID
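
Illustration (not from the paper): the abstract names redundancy, uniqueness, and synergy between two input modalities and a task label. The sketch below shows what such a decomposition looks like on tiny discrete distributions. It is not one of the paper's two estimators. It assumes the simple minimum-mutual-information (MMI) definition of redundancy, chosen here only because it has a closed form, and the function names `mutual_information` and `mmi_decomposition` are hypothetical. The authors' actual implementation is in the repository linked in the comment field.

```python
import math
from collections import defaultdict


def mutual_information(joint, src_idx, tgt_idx):
    """Return I(S;T) in bits.

    `joint` maps outcome tuples (x1, x2, y) to probabilities;
    `src_idx` and `tgt_idx` select which positions form S and T.
    """
    p_st = defaultdict(float)
    p_s = defaultdict(float)
    p_t = defaultdict(float)
    for outcome, p in joint.items():
        s = tuple(outcome[i] for i in src_idx)
        t = tuple(outcome[i] for i in tgt_idx)
        p_st[(s, t)] += p
        p_s[s] += p
        p_t[t] += p
    return sum(
        p * math.log2(p / (p_s[s] * p_t[t]))
        for (s, t), p in p_st.items()
        if p > 0
    )


def mmi_decomposition(joint):
    """Decompose I(X1,X2;Y) under the MMI redundancy assumption.

    Assumption (not the paper's estimator): redundancy is taken to be
    min(I(X1;Y), I(X2;Y)); the other three terms then follow from the
    standard consistency equations of a two-source decomposition.
    """
    i1 = mutual_information(joint, (0,), (2,))
    i2 = mutual_information(joint, (1,), (2,))
    i12 = mutual_information(joint, (0, 1), (2,))
    redundancy = min(i1, i2)
    return {
        "redundancy": redundancy,
        "unique_1": i1 - redundancy,
        "unique_2": i2 - redundancy,
        "synergy": i12 - i1 - i2 + redundancy,
    }


# Y = X1 XOR X2: neither modality alone predicts Y, so all information is synergy.
xor_task = {(0, 0, 0): 0.25, (0, 1, 1): 0.25, (1, 0, 1): 0.25, (1, 1, 0): 0.25}

# X1 = X2 = Y: both modalities carry the same bit, so all information is redundant.
copy_task = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}

xor_result = mmi_decomposition(xor_task)
copy_result = mmi_decomposition(copy_task)
print(xor_result)
print(copy_result)
```

Under this assumption the XOR task yields one bit of synergy and nothing else, while the copy task yields one bit of redundancy and nothing else. Exhaustive enumeration like this only works for small discrete variables, which is the limitation the abstract says the paper's estimators are designed to overcome.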

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2302.12247
Document Type :
Working Paper