Author: "Dwedari, Mohammed Munzer" / Database: OAIster - Searchworks@Jio Institute Digital Library Search Results

Searchworks

Author: Dwedari, Mohammed Munzer, Niessner, Matthias, and Chen, Dave Zhenyu
Abstract: 3D question answering is a young and largely unexplored field in 3D vision-language research. Previous methods are limited to a predefined answer space and cannot generate answers naturally. In this work, we recast question answering as a sequence generation task in order to generate free-form natural answers to questions about 3D scenes (Gen3DQA). To this end, we optimize our model directly on language rewards to ensure global sentence semantics. We also adapt a pragmatic language understanding reward to further improve sentence quality. Our method sets a new state of the art on the ScanQA benchmark (CIDEr score 72.22/66.57 on the test sets).
Published: 2023
