The Challenges of Reinforcement Learning

As we discussed in the previous lesson, reinforcement learning is a powerful tool that has demonstrated impressive performance across various domains. However, it comes with its own challenges. Let’s dive into why reinforcement learning can be challenging, especially when applied to robotics.

Sample Inefficiency

Reinforcement learning is notoriously sample inefficient. To get a policy to converge, you typically need between 10 million and 1 billion data samples, and that’s just for a single training run. When you’re tuning a policy, you might need to run dozens of these training sessions, which multiplies an already enormous amount of data.
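To get a feel for these numbers, here is a rough back-of-the-envelope sketch of how long it would take to collect that much data on physical hardware in real time. The 50 Hz control rate and the figure of 24 tuning runs are illustrative assumptions, not values from this lesson, and `wall_clock_days` is a hypothetical helper name.

```python
SECONDS_PER_DAY = 86_400


def wall_clock_days(num_samples: int, control_hz: float, num_robots: int = 1) -> float:
    """Days of continuous real-time operation needed to collect num_samples.

    Assumes each robot produces one sample per control step and never stops
    (no resets, no downtime), so this is an optimistic lower bound.
    """
    seconds = num_samples / (control_hz * num_robots)
    return seconds / SECONDS_PER_DAY


CONTROL_HZ = 50.0   # assumed control rate, for illustration only
TUNING_RUNS = 24    # assumed stand-in for "dozens" of training runs

for num_samples in (10_000_000, 1_000_000_000):
    days = wall_clock_days(num_samples, CONTROL_HZ)
    print(
        f"{num_samples:>13,} samples: {days:,.1f} days per run, "
        f"{days * TUNING_RUNS:,.1f} days for {TUNING_RUNS} runs"
    )
```

Under these assumptions, a single run on one robot takes roughly 2.3 days at the low end and about 231 days at the high end, before accounting for resets or repeated tuning runs. Adding more robots divides the time, but only linearly.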

Safety Concerns

When training on real robots, safety becomes a major issue. In the early stages of training, the robot’s behavior is erratic and unpredictable. This poses risks not only to the human operators overseeing the experiments but also to the robot itself. There’s a real chance of damaging expensive equipment.

Practical Challenges

Training on real robots is also logistically cumbersome. You need to set up and randomize the scene for each trial. When the robot inevitably falls, you have to pick it up and reset it. This process is time-consuming and labor-intensive.