
Proximal Policy Optimization With Time-Varying Muscle Synergy for the Control of an Upper Limb Musculoskeletal System

Authors :
Liu, Rong
Wang, Jiaxing
Chen, Yaru
Liu, Yin
Wang, Yongxuan
Gu, Jason
Source :
IEEE Transactions on Automation Science and Engineering: A Publication of the IEEE Robotics and Automation Society; 2024, Vol. 21, Issue 2, pp. 1929-1940 (12 pages)
Publication Year :
2024

Abstract

Because of their unique adaptability, flexibility, and robustness, musculoskeletal robotic systems are regarded as potential next-generation robots. However, motion learning and generation for such robotic systems remain challenging. This paper presents a neuromuscular control method, TMS-PPO, based on time-varying muscle synergy (TMS) and proximal policy optimization (PPO). Electromyogram (EMG) activation signals recorded from actual human motions are decomposed into TMSs based on their temporal properties. Network weights are trained with PPO to generate scale and phase coefficients, which modulate the TMSs to produce appropriate activation patterns and optimize motion learning of the musculoskeletal system. To verify the effectiveness of the proposed method, TMSs are extracted from human upper limb muscle activation signals, and TMS-PPO is compared with PPO in the motion learning and generation process of an upper limb musculoskeletal system. The results show that TMS-PPO completes the control tasks, with average joint errors below 0.05 rad. The TMSs serve as motion primitives of the musculoskeletal system, simulating the process by which the human central nervous system (CNS) controls muscles. Compared with PPO, TMS-PPO reduces energy consumption and significantly improves the learning rate: the number of learning episodes decreases from the order of $10^{4}$ to $10^{3}$, indicating that TMS-PPO has a stronger learning ability and a better physiological interpretation.

Note to Practitioners—Owing to the advantages of the musculoskeletal system, research on humanoid robots that imitate human driving mechanisms is being carried out vigorously worldwide. By exploiting human-like characteristics, musculoskeletal robots provide new opportunities to understand and validate the human mechanisms of muscle control and motion learning, to compare robot performance with that of humans, and to operate in the real world, e.g., as human-interactive robots, amusement robots, and medical training robots in the future. However, the strong redundancy, coupling, and nonlinearity of the system also raise many challenges for investigating the control problem. Inspired by how the human CNS controls the musculoskeletal system and realizes motion generalization, this paper proposes a novel muscle-synergy-based neuromuscular control method, TMS-PPO, that combines time-varying muscle synergy (TMS) and proximal policy optimization (PPO). The learning efficiency of PPO and the physiological interpretability of the control process are improved during the motion learning and generation processes of the musculoskeletal system. Preliminary simulation experiments suggest that the method is feasible in terms of control accuracy and efficiency. However, the control accuracy of TMS-PPO is comparable to that of PPO, without significant improvement. To address this, future work will introduce a cerebellar model into the control method, which plays the role of adjusting and correcting limb motions to achieve the accurate and stable control seen in human action.
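The abstract describes scale and phase coefficients, produced by a PPO-trained policy, that modulate extracted time-varying synergies into muscle activation patterns. The sketch below is not the authors' code; it is a minimal Python illustration assuming the standard time-varying synergy formulation (each synergy contributes a scaled, time-shifted waveform that is summed across synergies). All function names, array shapes, and values are illustrative placeholders.

```python
# Minimal sketch (not the authors' implementation) of composing muscle
# activations from time-varying muscle synergies (TMSs), assuming the model
# m(t) = sum_i c_i * W_i(t - t_i), where c_i is a scale coefficient and t_i
# a phase (onset) coefficient. Names and shapes are hypothetical.

import numpy as np

def compose_activations(synergies, scales, shifts, horizon):
    """Combine time-varying synergies into a muscle activation sequence.

    synergies : array (n_syn, n_muscles, syn_len)  extracted TMS waveforms
    scales    : array (n_syn,)                     amplitude coefficients c_i
    shifts    : array (n_syn,) of int              onset samples t_i
    horizon   : int                                output sequence length
    """
    n_syn, n_muscles, syn_len = synergies.shape
    activation = np.zeros((n_muscles, horizon))
    for i in range(n_syn):
        start = int(shifts[i])
        if start >= horizon:
            continue                      # synergy starts after the horizon
        end = min(start + syn_len, horizon)
        activation[:, start:end] += scales[i] * synergies[i, :, : end - start]
    # Activations sent to a musculoskeletal model are typically bounded in [0, 1].
    return np.clip(activation, 0.0, 1.0)

# Example usage: a policy trained with PPO would output `scales` and `shifts`;
# random placeholders stand in for those outputs here.
rng = np.random.default_rng(0)
synergies = rng.random((3, 7, 50))        # 3 synergies, 7 muscles, 50 samples each
scales = rng.uniform(0.2, 1.0, size=3)    # scale coefficients (policy output)
shifts = rng.integers(0, 30, size=3)      # phase/onset coefficients (policy output)
activations = compose_activations(synergies, scales, shifts, horizon=100)
print(activations.shape)                  # (7, 100)
```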

Details

Language :
English
ISSN :
1545-5955 and 1558-3783
Volume :
21
Issue :
2
Database :
Supplemental Index
Journal :
IEEE Transactions on Automation Science and Engineering: A Publication of the IEEE Robotics and Automation Society
Publication Type :
Periodical
Accession number :
ejs66119229
Full Text :
https://doi.org/10.1109/TASE.2023.3254583