Leveraging Suboptimal Human Interventions in Real-World RL
Analysis
Key Takeaways
- Addresses how to learn from suboptimal human interventions in real-world RL.
- Proposes SiLRI, a state-wise Lagrangian reinforcement learning algorithm.
- Formulates the problem as a constrained RL optimization problem.
- Cuts the time needed to reach a 90% success rate by at least 50% compared with the state-of-the-art RL method HIL-SERL.
- Achieves a 100% success rate on long-horizon manipulation tasks where other RL methods struggle to succeed.
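
To give a feel for what a "state-wise Lagrangian" treatment of a constrained problem can look like, here is a minimal toy sketch. It is not SiLRI's actual formulation, which this summary does not spell out. Everything in it is an assumption made for illustration: the function name `statewise_lagrangian`, the quadratic stand-in for the value objective, the squared-distance constraint that keeps the policy action close to the human action, and the per-state tolerance `eps`. The only idea it demonstrates is that each state carries its own multiplier, updated by projected dual ascent, so the constraint can bind tightly in some states and stay inactive in others.

```python
import numpy as np

def statewise_lagrangian(a_star, a_human, eps, steps=5000, lr_a=0.05, lr_lam=0.5):
    """Toy primal-dual solver with one Lagrange multiplier per state.

    Hypothetical setup (not taken from the paper):
      objective per state:  maximize  -(a - a_star)^2
      constraint per state: (a - a_human)^2 <= eps
    """
    a_star = np.asarray(a_star, dtype=float)
    a_human = np.asarray(a_human, dtype=float)
    eps = np.asarray(eps, dtype=float)

    a = a_human.copy()          # start from the human action in every state
    lam = np.zeros_like(a)      # one multiplier per state, kept non-negative

    for _ in range(steps):
        # Primal ascent on L(a, lam) = -(a - a_star)^2 - lam * ((a - a_human)^2 - eps)
        grad_a = -2.0 * (a - a_star) - 2.0 * lam * (a - a_human)
        a = a + lr_a * grad_a

        # Dual ascent: raise lam where the constraint is violated, project onto lam >= 0
        violation = (a - a_human) ** 2 - eps
        lam = np.maximum(0.0, lam + lr_lam * violation)

    return a, lam

# Two states: a tight tolerance in state 0, a loose one in state 1.
actions, multipliers = statewise_lagrangian(
    a_star=[1.0, 1.0], a_human=[0.0, 0.0], eps=[0.04, 4.0]
)
print(actions, multipliers)
```

In this toy run the tight tolerance in state 0 holds the action near the human action with a positive multiplier, while the loose tolerance in state 1 leaves the multiplier at zero and lets the action follow the objective. A single global multiplier could not express that difference between states.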
“SiLRI effectively exploits human suboptimal interventions, reducing the time required to reach a 90% success rate by at least 50% compared with the state-of-the-art RL method HIL-SERL, and achieving a 100% success rate on long-horizon manipulation tasks where other RL methods struggle to succeed.”