Atari Battle Zone Environment

Overview

Gameplay takes place on a flat plane with a mountainous horizon featuring an erupting volcano, a distant crescent moon, and various geometric solids (rendered in vector outline) such as pyramids and blocks. The screen includes an overhead radar view, which the player uses to find and destroy the slow-moving tanks and the faster supertanks. Saucer-shaped UFOs and guided missiles occasionally appear and award bonus points. The saucers differ from the tanks in that they do not fire upon the player and do not appear on radar. The player can hide behind the solids or, once fired upon, maneuver in rapid turns to buy time in which to return fire.

The geometric solids are indestructible and can block the movement of the player's tank; however, because they also block enemy fire, they are useful as shields.

Description from Wikipedia
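
For readers who want to try the environment directly, the sketch below runs a uniformly random policy for one episode. It assumes the Arcade Learning Environment exposed through Gymnasium under the ALE/BattleZone-v5 ID; the explicit registration call is needed on newer ale-py versions and may be unnecessary on older ones:

```python
# Minimal random-agent loop; assumes Gymnasium with the Atari extras
# installed, e.g. `pip install "gymnasium[atari]" ale-py`.
import ale_py
import gymnasium as gym

gym.register_envs(ale_py)  # explicit ALE registration (newer ale-py)

env = gym.make("ALE/BattleZone-v5")
obs, info = env.reset(seed=0)

episode_return, done = 0.0, False
while not done:
    action = env.action_space.sample()  # uniform random policy
    obs, reward, terminated, truncated, info = env.step(action)
    episode_return += reward
    done = terminated or truncated

env.close()
print(f"Random-policy episode return: {episode_return}")
```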

Performance of RL Agents

The tables below list the scores of various reinforcement learning algorithms tested in this environment. These results are taken from the RL Database. If this page was helpful, please consider giving it a star!


Human Starts

Under the human-starts protocol, each evaluation episode begins from a state sampled from human play, so agents cannot exploit a single deterministic starting position.

| Result | Algorithm | Source |
|--------|-----------|--------|
| 306500 | PDD DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 52040.0 | Rainbow | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 33030.0 | Human | Massively Parallel Methods for Deep Reinforcement Learning |
| 32250.0 | Distributional DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 31320.0 | DuDQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 29100.0 | Prioritized DDQN (prop, tuned) | Prioritized Experience Replay |
| 25520.0 | Prioritized DDQN (rank, tuned) | Prioritized Experience Replay |
| 25240.0 | DDQN | Deep Reinforcement Learning with Double Q-learning |
| 24740.0 | DDQN (tuned) | Deep Reinforcement Learning with Double Q-learning |
| 22250.0 | Prioritized DQN (rank) | Prioritized Experience Replay |
| 20760.0 | A3C LSTM | Asynchronous Methods for Deep Reinforcement Learning |
| 19938.0 | Gorila DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 17560.0 | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 12950.0 | A3C FF | Asynchronous Methods for Deep Reinforcement Learning |
| 11340.0 | A3C FF 1 day | Asynchronous Methods for Deep Reinforcement Learning |
| 3560.0 | Random | Massively Parallel Methods for Deep Reinforcement Learning |

No-op Starts

Under the no-op-starts protocol, each evaluation episode begins with a random number (up to 30) of no-op actions before the agent takes control.

| Result | Algorithm | Source |
|--------|-----------|--------|
| 98235.0 | Reactor ND | The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning |
| 64070.0 | Reactor | The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning |
| 62010.0 | Rainbow | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 61220 | Reactor | The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning |
| 52262 | NoisyNet DuDQN | Noisy Networks for Exploration |
| 42244 | IQN | Implicit Quantile Networks for Distributional Reinforcement Learning |
| 41145.0 | Distributional DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 40481 | DuDQN | Noisy Networks for Exploration |
| 39268 | QR-DQN-1 | Distributional Reinforcement Learning with Quantile Regression |
| 37800.0 | Human | Human-level control through deep reinforcement learning |
| 37187.5 | Human | Dueling Network Architectures for Deep Reinforcement Learning |
| 37150.0 | DuDQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 36786 | NoisyNet DQN | Noisy Networks for Exploration |
| 35580 | QR-DQN-0 | Distributional Reinforcement Learning with Quantile Regression |
| 35520.0 | PDD DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 31700.0 | DDQN | A Distributional Perspective on Reinforcement Learning |
| 29900.0 | DQN | A Distributional Perspective on Reinforcement Learning |
| 28981 | DQN | Noisy Networks for Exploration |
| 28742 | C51 | A Distributional Perspective on Reinforcement Learning |
| 26300 | DQN | Human-level control through deep reinforcement learning |
| 25730.0 | DDQN | Deep Reinforcement Learning with Double Q-learning |
| 25266.66 | Gorila DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 20885.0 | IMPALA (deep) | IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures |
| 17871 | NoisyNet A3C | Noisy Networks for Exploration |
| 16411 | A3C | Noisy Networks for Exploration |
| 15820 | Linear | Human-level control through deep reinforcement learning |
| 13015.0 | IMPALA (shallow) | IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures |
| 8910.0 | ACKTR | Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation |
| 7705.0 | IMPALA (deep, multitask) | IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures |
| 2360.0 | Random | Human-level control through deep reinforcement learning |
| 16.2 | Contingency | Human-level control through deep reinforcement learning |
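
To make the no-op protocol concrete, here is a minimal sketch of an evaluation loop; the helper name `evaluate_no_op_starts`, the episode count, and the other defaults are illustrative assumptions, not taken from the cited papers:

```python
# Sketch of the 30-no-op-starts evaluation protocol used in DQN-style
# papers; assumes the ALE environments are registered as in the
# snippet in the Overview section.
import random

import gymnasium as gym


def evaluate_no_op_starts(policy, env_id="ALE/BattleZone-v5",
                          episodes=30, max_no_ops=30):
    env = gym.make(env_id)
    returns = []
    for ep in range(episodes):
        obs, info = env.reset(seed=ep)
        # Begin each episode with a random number of no-ops (action 0)
        # so the agent cannot memorize one deterministic start state.
        for _ in range(random.randint(1, max_no_ops)):
            obs, _, terminated, truncated, info = env.step(0)
            if terminated or truncated:
                obs, info = env.reset()
        episode_return, done = 0.0, False
        while not done:
            obs, reward, terminated, truncated, info = env.step(policy(obs))
            episode_return += reward
            done = terminated or truncated
        returns.append(episode_return)
    env.close()
    return sum(returns) / len(returns)


# Example usage: score a uniform-random policy under the protocol.
sample_env = gym.make("ALE/BattleZone-v5")
print(evaluate_no_op_starts(lambda obs: sample_env.action_space.sample()))
```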

Normal Starts

These scores come from episodes that begin with a standard environment reset, with no human-start or no-op randomization.

| Result | Algorithm | Source |
|--------|-----------|--------|
| 17366.7 | PPO | Proximal Policy Optimization Algorithms |
| 8983.3 | ACER | Proximal Policy Optimization Algorithms |
| 3080.0 | A2C | Proximal Policy Optimization Algorithms |