Proximal Policy Optimization

Published: Jul 20, 2017 07:00
1 min read
OpenAI News

Analysis

This article announces OpenAI's release of Proximal Policy Optimization (PPO), a new reinforcement learning algorithm. Its key selling points are performance comparable or superior to existing methods, a simple implementation, and ease of tuning. The article also notes that PPO has become OpenAI's default reinforcement learning algorithm.
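
To give a sense of the implementation simplicity mentioned above, here is a minimal sketch of the clipped surrogate objective that the PPO paper describes. The announcement summarized here does not include this detail, so treat it as an illustration under stated assumptions: the function name, argument names, and the default clipping range of 0.2 are illustrative choices and are not taken from any OpenAI code.

```python
def clipped_surrogate(ratios, advantages, epsilon=0.2):
    """Mean clipped surrogate objective over a batch (a quantity to maximize).

    ratios:     new-policy / old-policy probability ratio for each sampled action
    advantages: advantage estimate for each sampled action
    epsilon:    clipping range (0.2 is an assumed, commonly used value)
    """
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    if not ratios:
        raise ValueError("batch must not be empty")

    total = 0.0
    for ratio, advantage in zip(ratios, advantages):
        # Restrict the ratio to the interval [1 - epsilon, 1 + epsilon].
        clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
        # Taking the smaller of the two terms makes the objective a
        # pessimistic bound, so large policy changes are not rewarded.
        total += min(ratio * advantage, clipped_ratio * advantage)
    return total / len(ratios)


if __name__ == "__main__":
    # Hypothetical batch: three sampled actions with made-up ratios and advantages.
    print(clipped_surrogate([1.5, 0.9, 0.5], [1.0, 2.0, -1.0]))
```

In practice this quantity would be computed on tensors in an automatic differentiation framework and combined with value-function and entropy terms; the pure-Python loop only shows the core calculation.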

Reference

PPO has become the default reinforcement learning algorithm at OpenAI because of its ease of use and good performance.