Overview

Bank Heist is a maze video game released by 20th Century Fox for the Atari 2600.

Each level in Bank Heist is a maze-like city, similar to Pac-Man. The objective is to rob as many banks as possible while evading the police. The player controls the Getaway Car, which has a limited amount of fuel that can be refilled by moving to a new city. Robbing a bank causes a cop car to appear, along with another bank; up to three cop cars can be present in a city at a time. Cop cars can be destroyed by dropping dynamite out of the Getaway Car's tail pipe, though the dynamite can also destroy the Getaway Car itself. The player starts with four spare cars (lives); a life is lost by running out of fuel, being hit by dynamite, or colliding with a cop car. Robbing nine banks in one city earns an extra car.

The left and right difficulty switches control how hard the game is. With the left switch set to A, the cop cars are smarter at chasing the Getaway Car; set to B, they move in a more fixed pattern. With the right switch set to A, the banks appear in random locations; set to B, they appear in preset locations.

Description from Wikipedia

State of the Art

Human Starts

| Result | Method | Type | Score from |
|-------:|--------|------|------------|
| 1200.8 | ApeX DQN | DQN | Distributed Prioritized Experience Replay |
| 1129.3 | DuelingDDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 1004.6 | DuelingPERDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 970.1 | A3C FF (4 days) | PG | Asynchronous Methods for Deep Reinforcement Learning |
| 955.0 | NoisyNetDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 946.0 | A3C FF (1 day) | PG | Asynchronous Methods for Deep Reinforcement Learning |
| 932.8 | A3C LSTM | PG | Asynchronous Methods for Deep Reinforcement Learning |
| 886.0 | DDQN | DQN | Deep Reinforcement Learning with Double Q-learning |
| 876.6 | PERDDQN (rank) | DQN | Prioritized Experience Replay |
| 835.6 | DistributionalDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 826.0 | RainbowDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 823.7 | PERDQN (rank) | DQN | Prioritized Experience Replay |
| 816.8 | PERDDQN (prop) | DQN | Prioritized Experience Replay |
| 644.5 | Human | Human | Massively Parallel Methods for Deep Reinforcement Learning |
| 399.42 | GorilaDQN | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 312.7 | DQN2015 | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 176.3 | DQN2015 | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 21.7 | Random | Random | Massively Parallel Methods for Deep Reinforcement Learning |
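Results like these are often reported as human-normalized scores, (agent − random) / (human − random), using the Human and Random baselines listed in the table. A minimal sketch of that calculation (the specific scores plugged in below are taken from the human-starts table):

```python
def human_normalized(agent: float, human: float, random: float) -> float:
    """Human-normalized score: 1.0 is human-level, 0.0 is random-level."""
    return (agent - random) / (human - random)

# Human and Random baselines from the human-starts table.
HUMAN, RANDOM = 644.5, 21.7

# ApeX DQN's 1200.8 is roughly 189% of the random-to-human score gap.
print(f"{human_normalized(1200.8, HUMAN, RANDOM):.1%}")
```

By this measure, every method scoring above 644.5 in the table is superhuman on Bank Heist under human starts.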

No-op Starts

| Result | Method | Type | Score from |
|-------:|--------|------|------------|
| 1716.4 | ApeX DQN | DQN | Distributed Prioritized Experience Replay |
| 1611.9 | DuelingDDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 1503.1 | DuelingPERDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 1428.0 | DuelingDQN | DQN | Noisy Networks for Exploration |
| 1358.0 | RainbowDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 1323.0 | NoisyNetDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 1318.0 | NoisyNet-DuelingDQN | DQN | Noisy Networks for Exploration |
| 1296.0 | A3C | PG | Noisy Networks for Exploration |
| 1289.7 | ACKTR | PG | Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation |
| 1280.2 | DQfD | Imitation | Deep Q-Learning from Demonstrations |
| 1240.8 | DuelingPERDDQN | DQN | Deep Q-Learning from Demonstrations |
| 1126.8 | PERDDQN (prop) | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 1103.3 | DDQN+PopArt | DQN | Learning values across many orders of magnitude |
| 1068.0 | NoisyNet-DQN | DQN | Noisy Networks for Exploration |
| 1056.7 | DistributionalDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 1054.6 | PER | DQN | Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation |
| 1054.6 | PERDDQN (rank) | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 1033.0 | NoisyNet-A3C | PG | Noisy Networks for Exploration |
| 1030.6 | DDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 976.0 | C51 | Misc | A Distributional Perspective on Reinforcement Learning |
| 753.1 | Human | Human | Dueling Network Architectures for Deep Reinforcement Learning |
| 734.4 | Human | Human | Human-level control through deep reinforcement learning |
| 728.3 | DDQN | DQN | Deep Reinforcement Learning with Double Q-learning |
| 609.0 | GorilaDQN | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 455.0 | DQN2015 | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 455.0 | DQN | DQN | Noisy Networks for Exploration |
| 429.7 | DQN2015 | DQN | Human-level control through deep reinforcement learning |
| 190.8 | Linear | Misc | Human-level control through deep reinforcement learning |
| 67.4 | Contingency | Misc | Human-level control through deep reinforcement learning |
| 14.2 | Random | Random | Human-level control through deep reinforcement learning |
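The three headings correspond to the standard Atari evaluation regimes: human starts begin each episode from a state sampled from recorded human play, no-op starts apply a random number of no-op actions (up to 30 in the DQN papers) after reset so the agent cannot memorize a single deterministic start state, and normal starts use a plain reset. A minimal sketch of the no-op-start reset logic, using a hypothetical stand-in environment rather than a real ALE emulator:

```python
import random

NOOP_ACTION = 0   # ALE action index 0 is the no-op
MAX_NOOPS = 30    # cap used in the DQN evaluation protocol

class ToyEnv:
    """Stand-in for an ALE environment (hypothetical, for illustration)."""
    def __init__(self):
        self.t = 0
    def reset(self):
        self.t = 0
        return self.t                  # "observation"
    def step(self, action):
        self.t += 1
        return self.t, 0.0, False      # observation, reward, done

def noop_start_reset(env, rng):
    """Reset env, then apply 1..MAX_NOOPS no-ops before the agent acts."""
    obs = env.reset()
    for _ in range(rng.randint(1, MAX_NOOPS)):
        obs, _, done = env.step(NOOP_ACTION)
        if done:                       # rare: restart if no-ops end the episode
            obs = env.reset()
    return obs

obs = noop_start_reset(ToyEnv(), random.Random(0))
assert 1 <= obs <= MAX_NOOPS
```

This mirrors the `NoopResetEnv`-style wrappers found in common Atari preprocessing code.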

Normal Starts

| Result | Method | Type | Score from |
|-------:|--------|------|------------|
| 1280.6 | PPO | PG | Proximal Policy Optimization Algorithms |
| 1177.5 | ACER | PG | Proximal Policy Optimization Algorithms |
| 1095.3 | A2C | PG | Proximal Policy Optimization Algorithms |