A real-time human-robot interaction framework with robust background invariant hand gesture detection
- Authors
- Osama Mazhar, Robin Passama, Sofiane Ramdani, Andrea Cherubini, Benjamin Navarro; Interactive Digital Humans (IDH), Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)
- Subjects
- Safe Collaborative Robotics, Physical Human-Robot Interaction, Human-Robot Interaction, Hand-Gesture Detection, Gesture Recognition, Convolutional Neural Networks, Transfer Learning, Computer Vision, Real-Time Vision, Skeleton Extraction, Depth Map, Asynchronous Communication, OpenPHRI, Robotics
- Abstract
In the factories of the future, productive and safe interaction between robots and their human coworkers will require the robot to extract essential information about the coworker in real time. We address this need with a reliable framework for real-time, safe human-robot collaboration based on static hand gestures and 3D skeleton extraction. The OpenPose library is integrated with the Microsoft Kinect V2 to obtain a 3D estimate of the human skeleton. With the help of 10 volunteers, we recorded an image dataset of alphanumeric static hand gestures taken from American Sign Language; we named this dataset OpenSign and released it to the community for benchmarking. The Inception V3 convolutional neural network is adapted and trained to detect the hand gestures. To augment the training data, we use OpenPose to localize the hands in the dataset images and segment their backgrounds by exploiting the Kinect V2 depth map; the backgrounds are then substituted with random patterns and indoor architecture templates. Fine-tuning of Inception V3 is performed in three phases, reaching a validation accuracy of 99.1% and a test accuracy of 98.9%. Image acquisition and hand gesture detection are integrated asynchronously to guarantee real-time detection. Finally, the proposed framework is integrated into our physical human-robot interaction library, OpenPHRI. This integration complements OpenPHRI by implementing the ISO/TS 15066 “safety-rated monitored stop” and “speed and separation monitoring” collaborative modes. We validate the performance of the proposed framework in a complete teaching-by-demonstration experiment with a robotic manipulator.
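The abstract does not spell out how the 2D OpenPose keypoints and the Kinect V2 depth map are combined, but a standard way to obtain a 3D skeleton is to back-project each detected keypoint through a pinhole camera model using the registered depth value. The sketch below assumes approximate Kinect V2 depth intrinsics and a hypothetical helper name; it illustrates the idea rather than the authors' implementation.

```python
import numpy as np

# Assumed (approximate) Kinect V2 depth-camera intrinsics, not from the paper.
FX, FY, CX, CY = 365.5, 365.5, 256.0, 212.0

def lift_keypoints_to_3d(keypoints_2d, depth_map):
    """Back-project 2D OpenPose keypoints to 3D camera coordinates.

    keypoints_2d: N x 2 array of (u, v) pixel coordinates from OpenPose
    depth_map:    H x W depth image in millimeters, registered to the keypoints
    Returns an N x 3 array in meters, with NaN where no depth reading exists.
    """
    pts = np.full((len(keypoints_2d), 3), np.nan)
    h, w = depth_map.shape
    for i, (u, v) in enumerate(keypoints_2d):
        col, row = int(round(u)), int(round(v))
        if 0 <= row < h and 0 <= col < w:
            z = depth_map[row, col] / 1000.0   # Kinect reports mm; 0 means invalid
            if z > 0:
                pts[i] = ((u - CX) * z / FX,   # pinhole back-projection
                          (v - CY) * z / FY,
                          z)
    return pts
```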
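The background-substitution augmentation can be pictured as depth-keyed compositing: pixels whose depth lies near the hand are kept, and everything else is replaced by a random pattern or indoor template. Below is a minimal sketch in Python with OpenCV and NumPy; the function name, the median-depth heuristic, and the depth_margin tolerance are illustrative assumptions, not the paper's exact procedure.

```python
import cv2
import numpy as np

def substitute_background(hand_rgb, hand_depth, background, depth_margin=150):
    """Replace the background of a hand crop using its aligned depth map.

    hand_rgb:     H x W x 3 uint8 color crop around the hand
    hand_depth:   H x W uint16 depth map (millimeters), aligned with hand_rgb
    background:   H x W x 3 uint8 replacement pattern or indoor template
    depth_margin: assumed tolerance (mm) around the hand's median depth
    """
    valid = hand_depth > 0                    # Kinect reports 0 for no reading
    hand_z = np.median(hand_depth[valid])     # estimate the hand's distance
    # Keep only pixels whose depth lies within the margin around the hand.
    mask = (np.abs(hand_depth.astype(np.int32) - hand_z) < depth_margin) & valid
    mask = cv2.morphologyEx(mask.astype(np.uint8), cv2.MORPH_CLOSE,
                            np.ones((5, 5), np.uint8))   # fill small holes
    mask3 = np.repeat(mask[:, :, None], 3, axis=2).astype(bool)
    out = background.copy()
    out[mask3] = hand_rgb[mask3]              # composite hand onto new background
    return out
```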
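The three training phases are not detailed in the abstract; a common staging for fine-tuning Inception V3, shown below with Keras, is to first train only a new classification head, then unfreeze the top inception blocks, and finally unfreeze the whole network at a decreasing learning rate. The layer boundary, learning rates, and number of gesture classes are assumptions for illustration, not the authors' settings.

```python
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras import layers, models, optimizers

NUM_GESTURES = 10   # assumed number of static gesture classes

base = InceptionV3(weights="imagenet", include_top=False, pooling="avg")
head = layers.Dense(NUM_GESTURES, activation="softmax")(base.output)
model = models.Model(base.input, head)

# Phase 1: train only the new classification head.
base.trainable = False
model.compile(optimizers.Adam(1e-3), "categorical_crossentropy", ["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=5)

# Phase 2: unfreeze the top inception blocks at a lower learning rate.
base.trainable = True
for layer in base.layers[:249]:        # assumed boundary below the top blocks
    layer.trainable = False
model.compile(optimizers.Adam(1e-4), "categorical_crossentropy", ["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=5)

# Phase 3: unfreeze everything for a final low-rate pass.
for layer in base.layers:
    layer.trainable = True
model.compile(optimizers.Adam(1e-5), "categorical_crossentropy", ["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=5)
```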
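Finally, the asynchronous coupling of image acquisition and gesture detection can be sketched with two threads that share only the latest frame, so slow CNN inference never blocks the camera loop and the detector always classifies the freshest image. The camera and classifier interfaces below are placeholders, not the framework's actual API.

```python
import threading
import time

class AsyncGestureDetector:
    def __init__(self, camera, classifier):
        self.camera = camera          # assumed object with a read() -> frame method
        self.classifier = classifier  # assumed object with a predict(frame) method
        self.latest_frame = None
        self.latest_gesture = None
        self.lock = threading.Lock()
        self.running = True

    def acquire(self):
        # Acquisition loop: always overwrite with the newest frame.
        while self.running:
            frame = self.camera.read()
            with self.lock:
                self.latest_frame = frame

    def detect(self):
        # Detection loop: classify whichever frame is newest, skipping stale ones.
        while self.running:
            with self.lock:
                frame = self.latest_frame
            if frame is not None:
                self.latest_gesture = self.classifier.predict(frame)
            else:
                time.sleep(0.001)     # wait for the first frame to arrive

    def start(self):
        threading.Thread(target=self.acquire, daemon=True).start()
        threading.Thread(target=self.detect, daemon=True).start()
```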
- Published
- 2019