MuJoCo Walker2D Environment

Overview

Make a two-dimensional bipedal robot walk forward as fast as possible.

Performances of RL Agents

The table below lists reinforcement learning algorithms that have been evaluated in this environment. The results are taken from RL Database; each one is the score reported by the listed source.

Result	Algorithm	Source
6874.1	TRPO	Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
6198.8	ACKTR	Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
6028.73	TRPO+GAE	Trust-PCL: An Off-Policy Trust Region Method for Continuous Control
5874.9	A2C	Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
5027.2	Trust-PCL	Trust-PCL: An Off-Policy Trust Region Method for Continuous Control
4682.82	TD3	Addressing Function Approximation Error in Actor-Critic Methods
3424.95	PPO	OpenAI Baselines ea68f3b
3317.69	PPO	Addressing Function Approximation Error in Actor-Critic Methods
3098.11	Our DDPG	Addressing Function Approximation Error in Actor-Critic Methods
2838.4	TRPO	Trust-PCL: An Off-Policy Trust Region Method for Continuous Control
2342.63	TRPO (MPI)	OpenAI Baselines ea68f3b
2321.47	TRPO	Addressing Function Approximation Error in Actor-Critic Methods
1843.85	DDPG	Addressing Function Approximation Error in Actor-Critic Methods
1283.67	SAC	Addressing Function Approximation Error in Actor-Critic Methods
1216.7	ACKTR	Addressing Function Approximation Error in Actor-Critic Methods
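
Several algorithms appear more than once in the table with very different scores, depending on the source. The sketch below is one way to make that spread visible. It copies the rows from the table and reports the lowest and highest score for each algorithm label that occurs more than once. The short source labels and the helper name `spread_by_algorithm` are made up for this illustration and are not part of RL Database. Labels are compared exactly as written, so "TRPO" and "TRPO (MPI)", or "DDPG" and "Our DDPG", count as different algorithms.

```python
from collections import defaultdict

# Rows copied from the table above: (score, algorithm, source).
# Source labels are shorthand chosen for this example:
#   "ACKTR paper"     = Scalable trust-region method ... Kronecker-factored approximation
#   "Trust-PCL paper" = Trust-PCL: An Off-Policy Trust Region Method for Continuous Control
#   "TD3 paper"       = Addressing Function Approximation Error in Actor-Critic Methods
#   "Baselines"       = OpenAI Baselines ea68f3b
ROWS = [
    (6874.1, "TRPO", "ACKTR paper"),
    (6198.8, "ACKTR", "ACKTR paper"),
    (6028.73, "TRPO+GAE", "Trust-PCL paper"),
    (5874.9, "A2C", "ACKTR paper"),
    (5027.2, "Trust-PCL", "Trust-PCL paper"),
    (4682.82, "TD3", "TD3 paper"),
    (3424.95, "PPO", "Baselines"),
    (3317.69, "PPO", "TD3 paper"),
    (3098.11, "Our DDPG", "TD3 paper"),
    (2838.4, "TRPO", "Trust-PCL paper"),
    (2342.63, "TRPO (MPI)", "Baselines"),
    (2321.47, "TRPO", "TD3 paper"),
    (1843.85, "DDPG", "TD3 paper"),
    (1283.67, "SAC", "TD3 paper"),
    (1216.7, "ACKTR", "TD3 paper"),
]


def spread_by_algorithm(rows):
    """Return {algorithm: (lowest, highest, count)} for labels listed more than once."""
    scores = defaultdict(list)
    for score, algorithm, _source in rows:
        scores[algorithm].append(score)
    return {
        algorithm: (min(values), max(values), len(values))
        for algorithm, values in scores.items()
        if len(values) > 1
    }


if __name__ == "__main__":
    for algorithm, (low, high, count) in spread_by_algorithm(ROWS).items():
        print(f"{algorithm}: {count} entries, lowest {low:.2f}, highest {high:.2f}")
```

The table does not say how each source trained or evaluated its agents, so scores from different sources are best read as separate reports rather than as a direct ranking.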

endtoend.ai
