Atari Pit Fall Environment

Overview

The player controls the character (Pitfall Harry) through a maze-like jungle in an attempt to recover 32 treasures in a 20-minute time period. Along the way, players must maneuver through numerous hazards, including pits, quicksand, rolling logs, fire, rattlesnakes, scorpions, and crocodiles. Harry may jump over or otherwise avoid these obstacles by climbing, running, or swinging on vines. Treasure includes bags of money, gold and silver bars, and diamond rings, which range in value from 2000 to 5000 points in 1000 point increments. There are eight of each treasure type, with 32 in total. A perfect score of 114,000 is achieved by claiming all 32 treasures without losing any points. Points are deducted by either falling in a hole (100 points) or touching logs; point loss depends on how long contact is made with the log. Under the jungle there is a tunnel which Harry can access through ladders found at various points. Traveling though the tunnel moves forward three screens at a time, which is necessary in order to collect all the treasures within the time limit. However, the tunnels are filled with dead-ends blocked by brick walls, forcing the player to return to the surface at one of the ladders, and try to find a way around again, thus wasting time. The tunnels also contain scorpions. The player loses a life if Harry comes in contact with any obstacle (except logs) or falls into a tar pit, quicksand, waterhole, or mouth of a crocodile. The game ends when either all 32 treasures have been collected, all three lives have been lost, or the time has run out.

Description from Wikipedia

Performances of RL Agents

We list various reinforcement learning algorithms that were tested in this environment. These results are from RL Database. If this page was helpful, please consider giving a star!

Star

Human Starts

Result	Algorithm	Source
5998.9	Human	Deep Reinforcement Learning with Double Q-learning
5998.9	Human	Dueling Network Architectures for Deep Reinforcement Learning
-14.8	Prioritized DDQN (prop, tuned)	Prioritized Experience Replay
-37.6	Rainbow	Rainbow: Combining Improvements in Deep Reinforcement Learning
-46.9	DuDQN	Dueling Network Architectures for Deep Reinforcement Learning
-78.5	A3C FF	Asynchronous Methods for Deep Reinforcement Learning
-113.2	DQN	Rainbow: Combining Improvements in Deep Reinforcement Learning
-123.0	A3C FF 1 day	Asynchronous Methods for Deep Reinforcement Learning
-135.7	A3C LSTM	Asynchronous Methods for Deep Reinforcement Learning
-186.7	DDQN (tuned)	Deep Reinforcement Learning with Double Q-learning
-193.7	Prioritized DQN (rank)	Prioritized Experience Replay
-243.6	PDD DQN	Dueling Network Architectures for Deep Reinforcement Learning
-342.8	Distributional DQN	Rainbow: Combining Improvements in Deep Reinforcement Learning
-348.8	Random	Deep Reinforcement Learning with Double Q-learning
-427.0	Prioritized DDQN (rank, tuned)	Prioritized Experience Replay
-432.9	DDQN	Deep Reinforcement Learning with Double Q-learning

No-op Starts

Result	Algorithm	Source
6463.7	Human	Dueling Network Architectures for Deep Reinforcement Learning
0.0	C51	A Distributional Perspective on Reinforcement Learning
0.0	DuDQN	Dueling Network Architectures for Deep Reinforcement Learning
0.0	PDD DQN	Dueling Network Architectures for Deep Reinforcement Learning
0	DQN	Noisy Networks for Exploration
0	NoisyNet DQN	Noisy Networks for Exploration
0	A3C	Noisy Networks for Exploration
0	NoisyNet A3C	Noisy Networks for Exploration
0	DuDQN	Noisy Networks for Exploration
0	NoisyNet DuDQN	Noisy Networks for Exploration
0.0	QR-DQN-0	Distributional Reinforcement Learning with Quantile Regression
0.0	QR-DQN-1	Distributional Reinforcement Learning with Quantile Regression
0.0	Rainbow	Rainbow: Combining Improvements in Deep Reinforcement Learning
0.0	IQN	Implicit Quantile Networks for Distributional Reinforcement Learning
-1.1	ACKTR	Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
-1.22	IMPALA (deep, multitask)	IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
-1.66	IMPALA (deep)	IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
-2.1	Distributional DQN	Rainbow: Combining Improvements in Deep Reinforcement Learning
-3.5	Reactor	The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning
-3.7	Reactor ND	The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning
-8.9	Reactor	The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning
-11.14	IMPALA (shallow)	IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
-29.9	DDQN	A Distributional Perspective on Reinforcement Learning
-229.4	Random	Dueling Network Architectures for Deep Reinforcement Learning
-286.1	DQN	A Distributional Perspective on Reinforcement Learning
-286.1	DQN	Rainbow: Combining Improvements in Deep Reinforcement Learning

Normal Starts

Result	Algorithm	Source
0	Dynamics	Exploration by Random Network Distillation
0	PPO	Exploration by Random Network Distillation
-3	RND	Exploration by Random Network Distillation
-16.9	ACER	Proximal Policy Optimization Algorithm
-32.9	PPO	Proximal Policy Optimization Algorithm
-55.0	A2C	Proximal Policy Optimization Algorithm

endtoend.ai