Reinforcement Learning for Damaged Robot Control

In reinforcement learning, policies are usually learned in simulation for cost and safety reasons and then transferred to the real world. However, the learned policy often cannot adapt, because real-world disturbances and robot failures create gaps between the two environments. To narrow these gaps, policies that can adapt to various scenarios are needed. In this study, we propose a reinforcement learning method for acquiring a policy that is robust against robot failures. In the proposed method, a failure is represented by adjusting the robot's physical parameters, and the policy is trained under various faults by randomizing those parameters during learning. In experiments on quadruped walking tasks, we demonstrate that a robot trained with the proposed method achieves higher average rewards than a conventionally trained robot in simulation environments with and without robot failures.
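The core idea, randomizing physical parameters per episode so that faults are covered during training, can be sketched as below. This is a minimal illustration, not the paper's implementation: the parameter names, ranges, and the `train` loop are hypothetical, and the simulator call is left as a comment.

```python
import random

# Hypothetical physical-parameter ranges; names and values are illustrative
# and not taken from the paper. A scale of 0.0 models a fully failed actuator.
PARAM_RANGES = {
    "joint_torque_scale": (0.0, 1.0),
    "link_mass_scale": (0.8, 1.2),
}

def sample_fault_params(rng=random):
    """Sample one set of physical parameters, representing a possible fault."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in PARAM_RANGES.items()}

def train(num_episodes=3):
    """Skeleton training loop: re-randomize the robot's physics each episode."""
    history = []
    for _ in range(num_episodes):
        params = sample_fault_params()
        # env.set_physical_parameters(params)  # apply to the simulator here
        # ... run one RL episode and update the policy ...
        history.append(params)
    return history

episodes = train()
```

Because a fresh fault configuration is drawn every episode, the policy cannot overfit to one fixed robot model and must learn behavior that works across the sampled fault range.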

Wataru Okamoto and Kazuhiko Kawamoto, Reinforcement Learning with Randomized Physical Parameters for Fault-Tolerant Robots, Proc. SCIS-ISIS, pp. 449-452, 2020.