Atari Crazy Climber Environment

Overview

In Crazy Climber the player assumes the role of a person attempting to climb to the top of four skyscrapers. The climber is controlled via two joysticks. There are a number of obstacles and dangers to avoid, including:

  • Windows that open and close (the most common danger).
  • Bald-headed residents (a.k.a. Mad Doctors), who throw objects such as flower pots, buckets of water and fruit in an effort to knock the climber off the building (more aggressive Mad Doctors in later levels throw larger objects).
  • A giant condor, who drops eggs and excrement aimed at the climber (two at a time in the early stages, four in later levels).
  • A giant ape (styled like King Kong), whose punch can prove deadly (he becomes more aggressive in later levels).
  • Falling steel girders and iron dumbbells (more numerous in the later levels).
  • Live wires, which protrude off electric signs.
  • Falling ‘Crazy Climber’ signs.

Some of these dangers appear at every level of the game; others make appearances only in later stages. Should the climber succumb to any one of these dangers, a new climber takes his place at the exact point where he fell, and the last major danger is eliminated.

One ally the climber has is a pink “Lucky Balloon”; if he is able to grab it, the climber is transported up eight stories to a window, which may be about to close. If that window is fully closed when he arrives, the balloon pauses there until the window opens again. The player does not earn bonus points for catching the balloon, but he is awarded the normal ‘step value’ for each of the eight floors he passes while holding it.

If the climber is able to ascend to the top of a skyscraper and grab the runner of a waiting helicopter, he earns a bonus and is transported to another skyscraper, which presents more dangers than the previous one. The helicopter waits only about 30 seconds before flying off.

If the player completes all four skyscrapers, he is taken back to the first skyscraper and the game restarts from the beginning, but the player keeps his score.

The difficulty level of any game was modified to take into account the skill of previous players. Hence, if a player pushed the high score up to, say, 250,000 (which required a very good player), any novice who followed would be thoroughly wiped out for several games due to the increased difficulty level, and would have to keep playing until it dropped back down.

Description from Wikipedia
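
For reference, below is a minimal sketch of loading this environment through the Arcade Learning Environment with Gymnasium and ale-py, running one episode with a random policy. The environment id ALE/CrazyClimber-v5 and the register_envs call follow current ale-py documentation and are assumptions about your installed versions; older Gym releases expose ids such as CrazyClimberNoFrameskip-v4 instead, and the Atari ROMs must be installed separately.

```python
# Minimal sketch (assumes gymnasium, ale-py and the Atari ROMs are installed).
import gymnasium as gym
import ale_py

gym.register_envs(ale_py)  # registers the ALE/* ids on recent gymnasium/ale-py versions

env = gym.make("ALE/CrazyClimber-v5")
obs, info = env.reset(seed=0)

episode_return = 0.0
done = False
while not done:
    action = env.action_space.sample()  # random policy, for illustration only
    obs, reward, terminated, truncated, info = env.step(action)
    episode_return += reward
    done = terminated or truncated

print("Episode return:", episode_return)
env.close()
```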

Performances of RL Agents

We list various reinforcement learning algorithms that were tested in this environment. These results are from RL Database. If this page was helpful, please consider giving a star!
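
The headings below refer to the evaluation protocols used in the cited papers: under no-op starts each evaluation episode begins with a random number (up to 30) of no-op actions, under human starts episodes begin from states sampled from human play, and normal starts simply reset the emulator. As a rough illustration, a hedged sketch of the no-op starts protocol is given below; it assumes the same Gymnasium/ale-py setup as above, that action 0 is NOOP (true for the standard ALE action sets), and agent_policy is a hypothetical stand-in for a trained agent.

```python
# Sketch of the "no-op starts" evaluation protocol (assumptions noted above).
import random
import gymnasium as gym
import ale_py

gym.register_envs(ale_py)


def evaluate_noop_starts(env_id="ALE/CrazyClimber-v5", episodes=10,
                         max_noops=30, agent_policy=None):
    """Run evaluation episodes that each begin with a random NOOP prefix."""
    env = gym.make(env_id)
    returns = []
    for _ in range(episodes):
        obs, info = env.reset()
        # Random number of no-op actions before the agent takes control.
        for _ in range(random.randint(1, max_noops)):
            obs, reward, terminated, truncated, info = env.step(0)
            if terminated or truncated:
                obs, info = env.reset()
        episode_return, done = 0.0, False
        while not done:
            action = agent_policy(obs) if agent_policy else env.action_space.sample()
            obs, reward, terminated, truncated, info = env.step(action)
            episode_return += reward
            done = terminated or truncated
        returns.append(episode_return)
    env.close()
    return sum(returns) / len(returns)
```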

Human Starts

Result | Algorithm | Source
154416.5 | Distributional DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning
143962.0 | Rainbow | Rainbow: Combining Improvements in Deep Reinforcement Learning
138518.0 | A3C LSTM | Asynchronous Methods for Deep Reinforcement Learning
131086.0 | Prioritized DDQN (prop, tuned) | Prioritized Experience Replay
127853.0 | PDD DQN | Dueling Network Architectures for Deep Reinforcement Learning
127512.0 | Prioritized DDQN (rank, tuned) | Prioritized Experience Replay
124566.0 | DuDQN | Dueling Network Architectures for Deep Reinforcement Learning
113782.0 | DDQN (tuned) | Deep Reinforcement Learning with Double Q-learning
112646.0 | A3C FF | Asynchronous Methods for Deep Reinforcement Learning
109337.0 | Prioritized DQN (rank) | Prioritized Experience Replay
101624.0 | A3C FF 1 day | Asynchronous Methods for Deep Reinforcement Learning
94315.0 | DDQN | Deep Reinforcement Learning with Double Q-learning
65451.0 | Gorila DQN | Massively Parallel Methods for Deep Reinforcement Learning
50992.0 | DQN | Massively Parallel Methods for Deep Reinforcement Learning
32667.0 | Human | Massively Parallel Methods for Deep Reinforcement Learning
9337.0 | Random | Massively Parallel Methods for Deep Reinforcement Learning

No-op Starts

Result | Algorithm | Source
236422.0 | Reactor | The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning
194347 | Reactor | The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning
181233 | QR-DQN-0 | Distributional Reinforcement Learning with Quantile Regression
179877 | C51 | A Distributional Perspective on Reinforcement Learning
179082 | IQN | Implicit Quantile Networks for Distributional Reinforcement Learning
178355.0 | Distributional DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning
173274.0 | Reactor ND | The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning
171171 | NoisyNet DuDQN | Noisy Networks for Exploration
168788.5 | Rainbow | Rainbow: Combining Improvements in Deep Reinforcement Learning
163335 | DuDQN | Noisy Networks for Exploration
162224.0 | PDD DQN | Dueling Network Architectures for Deep Reinforcement Learning
161196 | QR-DQN-1 | Distributional Reinforcement Learning with Quantile Regression
150444.0 | ACKTR | Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
143570.0 | DuDQN | Dueling Network Architectures for Deep Reinforcement Learning
139950 | NoisyNet A3C | Noisy Networks for Exploration
136950.0 | IMPALA (deep) | IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
136211.5 | IMPALA (shallow) | IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
134783 | A3C | Noisy Networks for Exploration
118305 | NoisyNet DQN | Noisy Networks for Exploration
117282.0 | DDQN | A Distributional Perspective on Reinforcement Learning
116480 | DQN | Noisy Networks for Exploration
115384.0 | IMPALA (deep, multitask) | IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
114103 | DQN | Human-level control through deep reinforcement learning
110763.0 | DQN | A Distributional Perspective on Reinforcement Learning
101874.0 | DDQN | Deep Reinforcement Learning with Double Q-learning
85919.16 | Gorila DQN | Massively Parallel Methods for Deep Reinforcement Learning
35829.4 | Human | Dueling Network Architectures for Deep Reinforcement Learning
35410.5 | Human | Human-level control through deep reinforcement learning
23411 | Linear | Human-level control through deep reinforcement learning
10780.5 | Random | Human-level control through deep reinforcement learning
149.8 | Contingency | Human-level control through deep reinforcement learning

Normal Starts

Result | Algorithm | Source
132461.0 | ACER | Proximal Policy Optimization Algorithms
110202.0 | PPO | Proximal Policy Optimization Algorithms
107770.0 | A2C | Proximal Policy Optimization Algorithms