
Object Instance Retrieval in Assistive Robotics: Leveraging Fine-Tuned SimSiam with Multi-View Images Based on 3D Semantic Map

Authors :
Sakaguchi, Taichi
Taniguchi, Akira
Hagiwara, Yoshinobu
Hafi, Lotfi El
Hasegawa, Shoichi
Taniguchi, Tadahiro
Publication Year :
2024

Abstract

Robots that assist in daily life must locate specific object instances in the environment that match the object a user desires. This task is known as Instance-Specific Image Goal Navigation (InstanceImageNav), and it requires a model that can distinguish between different instances within the same class. A significant challenge in robotics is that the appearance of an object can differ greatly when the robot observes it from different 3D viewpoints, making it difficult to recognize and locate the object accurately. In this study, we introduce SimView, a method that leverages multi-view images based on a 3D semantic map of the environment and self-supervised learning with SimSiam to train an instance identification model on site. The effectiveness of our approach is validated using Habitat Matterport 3D, a photorealistic simulator created by scanning real home environments. Our results demonstrate a 1.7-fold improvement in task accuracy over CLIP, a model pre-trained with multimodal contrastive learning, on the object search task. This improvement highlights the benefit of the proposed fine-tuning method for assistive robots performing InstanceImageNav tasks. The project website is https://emergentsystemlabstudent.github.io/MultiViewRetrieve/.

Comment :
See website at https://emergentsystemlabstudent.github.io/MultiViewRetrieve/. Submitted to IROS 2024.
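To illustrate the self-supervised objective named in the abstract, the following is a minimal, hypothetical sketch of a SimSiam-style fine-tuning step on paired images of the same object instance (e.g., crops of one object observed from two viewpoints). The backbone, layer sizes, optimizer settings, and the stand-in tensors are illustrative assumptions, not the authors' implementation.

# Sketch of a SimSiam training step on two views of the same object instance.
# All architectural and data details here are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

class SimSiam(nn.Module):
    def __init__(self, dim=2048, pred_dim=512):
        super().__init__()
        backbone = models.resnet50(weights=None)
        backbone.fc = nn.Identity()                      # drop the classification head
        self.encoder = nn.Sequential(                    # backbone + projection MLP
            backbone,
            nn.Linear(2048, dim), nn.BatchNorm1d(dim), nn.ReLU(inplace=True),
            nn.Linear(dim, dim), nn.BatchNorm1d(dim),
        )
        self.predictor = nn.Sequential(                  # prediction MLP
            nn.Linear(dim, pred_dim), nn.BatchNorm1d(pred_dim), nn.ReLU(inplace=True),
            nn.Linear(pred_dim, dim),
        )

    def forward(self, x1, x2):
        z1, z2 = self.encoder(x1), self.encoder(x2)      # projections of the two views
        p1, p2 = self.predictor(z1), self.predictor(z2)  # predictions
        return p1, p2, z1.detach(), z2.detach()          # stop-gradient on the targets

def simsiam_loss(p1, p2, z1, z2):
    # Symmetrized negative cosine similarity between predictions and stopped targets.
    return -(F.cosine_similarity(p1, z2).mean()
             + F.cosine_similarity(p2, z1).mean()) * 0.5

model = SimSiam()
optimizer = torch.optim.SGD(model.parameters(), lr=0.05, momentum=0.9)

# Stand-in tensors for two multi-view crops of the same instance.
view1, view2 = torch.randn(8, 3, 224, 224), torch.randn(8, 3, 224, 224)
p1, p2, z1, z2 = model(view1, view2)
loss = simsiam_loss(p1, p2, z1, z2)
loss.backward()
optimizer.step()

In the paper's setting, the paired views would come from re-projecting object regions of a 3D semantic map into multiple camera viewpoints; the sketch above only shows the contrastive-free SimSiam objective applied to such pairs.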

Subjects

Computer Science - Robotics

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2404.09647
Document Type :
Working Paper