MuJoCo Inverted Pendulum Environment

Overview

This is the MuJoCo version of CartPole. The agent applies a force to the cart, and its goal is to keep the pole attached to the cart balanced upright.

Performance of RL Agents

The table below lists reinforcement learning algorithms that were evaluated in this environment, together with the reported result and the source of each result. These results are from RL Database.

Result  Algorithm   Source
1000.0  ACKTR       Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
1000.0  A2C         Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
1000.0  TRPO        Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
1000.0  ACKTR       Addressing Function Approximation Error in Actor-Critic Methods
1000.0  DDPG        Addressing Function Approximation Error in Actor-Critic Methods
1000.0  Our DDPG    Addressing Function Approximation Error in Actor-Critic Methods
1000.0  PPO         Addressing Function Approximation Error in Actor-Critic Methods
1000.0  SAC         Addressing Function Approximation Error in Actor-Critic Methods
1000.0  TD3         Addressing Function Approximation Error in Actor-Critic Methods
985.4   TRPO        Addressing Function Approximation Error in Actor-Critic Methods
905.1   TRPO (MPI)  OpenAI Baselines ea68f3b
809.43  PPO         OpenAI Baselines ea68f3b
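
Many of the entries above tie at exactly 1000.0. One plausible reading is that 1000.0 is the maximum achievable return. This assumes the standard Gym configuration of the task: a reward of +1 for every step the pole stays upright and a 1000-step episode limit. Neither setting is stated on this page. The sketch below illustrates that arithmetic; the constants and function names are hypothetical and chosen only for illustration.

```python
# Assumed Gym defaults for the inverted pendulum task (not stated on this page).
MAX_EPISODE_STEPS = 1000  # assumed time limit per episode
REWARD_PER_STEP = 1.0     # assumed reward for each step the pole stays upright


def episode_return(rewarded_steps=None):
    """Return of one episode under the assumptions above.

    rewarded_steps: number of steps that earned a reward before the pole fell,
    or None if the pole never fell and the time limit ended the episode.
    """
    if rewarded_steps is None:
        steps = MAX_EPISODE_STEPS
    else:
        # An episode cannot run longer than the time limit.
        steps = min(rewarded_steps, MAX_EPISODE_STEPS)
    return steps * REWARD_PER_STEP


def mean_return(episodes):
    """Average return over a list of episodes (each entry as in episode_return)."""
    returns = [episode_return(steps) for steps in episodes]
    return sum(returns) / len(returns)


if __name__ == "__main__":
    # A policy that never drops the pole reaches the assumed cap.
    print(episode_return(None))
    # One perfect episode and one that ends after 800 rewarded steps.
    print(mean_return([None, 800]))
```

Under these assumptions, a policy that never drops the pole scores 1000.0, and a lower average would arise when the pole falls early in some evaluation episodes.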