
A Novel Robotic Pushing and Grasping Method Based on Vision Transformer and Convolution

Authors :
Yu, Sheng
Zhai, Di-Hua
Xia, Yuanqing
Source :
IEEE Transactions on Neural Networks and Learning Systems; August 2024, Vol. 35, Issue 8, pp. 10832-10845, 14 pp.
Publication Year :
2024

Abstract

Robotic grasping techniques have been widely studied in recent years. However, grasping in cluttered scenes remains a challenging problem for robots. In such scenes, objects are placed close to each other, leaving no space around them for the robot to place the gripper, which makes it difficult to find a suitable grasping position. To solve this problem, this article proposes combining pushing and grasping (PG) actions to aid grasp pose detection and robot grasping. We propose a PG combined grasping network (GN), the PG method based on transformer and convolution (PGTC). For the pushing action, we propose a vision transformer (ViT)-based object position prediction network, the pushing transformer network (PTNet), which captures global and temporal features well and can better predict the positions of objects after pushing. For grasping detection, we propose a cross dense fusion network (CDFNet), which makes full use of the RGB image and depth image, fusing and refining them several times. Compared with previous networks, CDFNet is able to detect the optimal grasping position more accurately. Finally, we use the network in both simulated and real UR3 robot grasping experiments and achieve state-of-the-art (SOTA) performance. Video and dataset are available at https://youtu.be/Q58YE-Cc250.
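To make the idea of fusing and repeatedly refining RGB and depth features more concrete, the sketch below illustrates a minimal cross-fusion scheme in PyTorch. It is not the authors' CDFNet: the module names, channel sizes, number of fusion stages, and output head are assumptions made purely for illustration of the fuse-and-refine idea described in the abstract.

```python
# Illustrative sketch only (hypothetical layer sizes and names, not the paper's CDFNet).
# Idea: extract RGB and depth features separately, then fuse and refine the two
# streams over several stages before predicting a pixel-wise grasp-quality map.
import torch
import torch.nn as nn


class CrossFusionStage(nn.Module):
    """One fusion stage: each stream is refined using the concatenated pair of streams."""

    def __init__(self, channels: int):
        super().__init__()
        self.rgb_refine = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        self.depth_refine = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, rgb_feat, depth_feat):
        fused = torch.cat([rgb_feat, depth_feat], dim=1)
        return self.rgb_refine(fused), self.depth_refine(fused)


class ToyCrossDenseFusion(nn.Module):
    """Toy RGB-D fusion network with repeated cross-fusion stages (hypothetical)."""

    def __init__(self, channels: int = 32, num_stages: int = 3):
        super().__init__()
        self.rgb_stem = nn.Conv2d(3, channels, kernel_size=3, padding=1)
        self.depth_stem = nn.Conv2d(1, channels, kernel_size=3, padding=1)
        self.stages = nn.ModuleList(CrossFusionStage(channels) for _ in range(num_stages))
        # Single-channel output interpreted as a pixel-wise grasp-quality map.
        self.head = nn.Conv2d(2 * channels, 1, kernel_size=1)

    def forward(self, rgb, depth):
        r, d = self.rgb_stem(rgb), self.depth_stem(depth)
        for stage in self.stages:
            r, d = stage(r, d)
        return torch.sigmoid(self.head(torch.cat([r, d], dim=1)))


if __name__ == "__main__":
    net = ToyCrossDenseFusion()
    rgb = torch.randn(1, 3, 224, 224)    # RGB image
    depth = torch.randn(1, 1, 224, 224)  # aligned depth image
    print(net(rgb, depth).shape)         # torch.Size([1, 1, 224, 224])
```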

Details

Language :
English
ISSN :
2162-237X and 2162-2388
Volume :
35
Issue :
8
Database :
Supplemental Index
Journal :
IEEE Transactions on Neural Networks and Learning Systems
Publication Type :
Periodical
Accession number :
ejs67130374
Full Text :
https://doi.org/10.1109/TNNLS.2023.3244186