# Atari Asteroids Environment

## Overview

The objective of Asteroids is to destroy asteroids and saucers. The player controls a triangular ship that can rotate left and right, fire shots straight forward, and thrust forward.[3] Once the ship begins moving in a direction, it will continue in that direction for a time without player intervention unless the player applies thrust in a different direction. The ship eventually comes to a stop when not thrusting. The player can also send the ship into hyperspace, causing it to disappear and reappear in a random location on the screen, at the risk of self-destructing or appearing on top of an asteroid.[4]
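
The same controls are exposed to an agent as a discrete action set through the Arcade Learning Environment. The snippet below is a minimal sketch of loading the game via Gymnasium's ALE interface and listing the action meanings; the packages (`gymnasium`, `ale-py`) and the environment id `ALE/Asteroids-v5` are not part of the original description and may differ across versions.

```python
# Minimal sketch: inspect Asteroids' discrete action set via Gymnasium + ALE.
# Assumes gymnasium and ale-py are installed; the env id may differ in older versions.
import gymnasium as gym
import ale_py

gym.register_envs(ale_py)  # recent ale-py releases need this; older ones auto-register

env = gym.make("ALE/Asteroids-v5")
print(env.action_space)                     # discrete action set
print(env.unwrapped.get_action_meanings())  # NOOP, FIRE, UP (thrust), LEFT/RIGHT (rotate), ...

obs, info = env.reset(seed=0)
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
env.close()
```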

Each level starts with a few large asteroids drifting in various directions on the screen. Objects wrap around screen edges: for instance, an asteroid that drifts off the top edge of the screen reappears at the bottom and continues moving in the same direction.[5] As the player shoots asteroids, they break into smaller asteroids that move faster and are more difficult to hit; smaller asteroids are also worth more points. Two flying saucers appear periodically on the screen: the “big saucer” shoots randomly and poorly, while the “small saucer” fires frequently at the ship. After the score reaches 40,000, only the small saucer appears, and as the score increases further, the angle range of its shots narrows until it fires extremely accurately.[6] Once the screen has been cleared of all asteroids and flying saucers, a new set of large asteroids appears, starting the next level. The game gets harder as the number of asteroids increases, until the score reaches somewhere between 40,000 and 60,000, after which the difficulty stops increasing.[7] The player starts with three lives after inserting a coin and gains an extra life for every 10,000 points.[8] When the player loses all their lives, the game ends.
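
The screen-wrapping rule described above is effectively modular arithmetic on object positions. The toy sketch below illustrates it; this is not the emulator's actual logic, and the 160×210 frame size is simply the standard Atari screen resolution, used here only for illustration.

```python
# Toy illustration of screen wrap-around: an object that drifts off one edge
# reappears at the opposite edge with its velocity unchanged.
SCREEN_W, SCREEN_H = 160, 210  # standard Atari frame size, illustrative only

def wrap_position(x: float, y: float) -> tuple[float, float]:
    """Wrap a position onto the torus formed by joining opposite screen edges."""
    return x % SCREEN_W, y % SCREEN_H

# An asteroid moving past the top edge reappears at the bottom.
x, y, vx, vy = 80.0, 2.0, 0.0, -5.0
x, y = wrap_position(x + vx, y + vy)
print(x, y)  # -> (80.0, 207.0)
```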

*Description from Wikipedia*

## Performances of RL Agents

The tables below list scores reported for various reinforcement learning algorithms evaluated in this environment. These results are taken from RL Database. If this page was helpful, please consider giving a star!


### Human Starts

| Result | Algorithm | Source |
|--------|-----------|--------|
| 36517.3 | Human | Massively Parallel Methods for Deep Reinforcement Learning |
| 5093.1 | A3C LSTM | Asynchronous Methods for Deep Reinforcement Learning |
| 4474.5 | A3C FF | Asynchronous Methods for Deep Reinforcement Learning |
| 3009.4 | A3C FF 1 day | Asynchronous Methods for Deep Reinforcement Learning |
| 2249.4 | Rainbow | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 2071.7 | Distributional DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 2035.4 | DuDQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 1745.1 | Prioritized DDQN (rank, tuned) | Prioritized Experience Replay |
| 1677.2 | Prioritized DQN (rank) | Prioritized Experience Replay |
| 1654.0 | Prioritized DDQN (prop, tuned) | Prioritized Experience Replay |
| 1219.0 | DDQN | Deep Reinforcement Learning with Double Q-learning |
| 1193.2 | DDQN (tuned) | Deep Reinforcement Learning with Double Q-learning |
| 1021.9 | PDD DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 933.63 | Gorila DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 871.3 | Random | Massively Parallel Methods for Deep Reinforcement Learning |
| 697.1 | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
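
A common way to read these tables is the human-normalized score used throughout the DQN literature: 0% corresponds to the random baseline and 100% to the human baseline. The sketch below uses the Human and Random rows from the human-starts table above; the function name is just illustrative.

```python
# Human-normalized score: 0.0 = random baseline, 1.0 (100%) = human baseline.
# Baseline values are taken from the human-starts table above.
HUMAN_SCORE = 36517.3   # Human (Massively Parallel Methods for Deep RL)
RANDOM_SCORE = 871.3    # Random (same source)

def human_normalized(score: float) -> float:
    return (score - RANDOM_SCORE) / (HUMAN_SCORE - RANDOM_SCORE)

print(f"{human_normalized(2249.4):.2%}")  # Rainbow, human starts -> roughly 3.9%
```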

### No-op Starts

| Result | Algorithm | Source |
|--------|-----------|--------|
| 108590.05 | IMPALA (deep) | IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures |
| 86700 | NoisyNet DuDQN | Noisy Networks for Exploration |
| 47388.7 | Human | Dueling Network Architectures for Deep Reinforcement Learning |
| 34171.6 | ACKTR | Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation |
| 13156.7 | Human | Human-level control through deep reinforcement learning |
| 4541 | NoisyNet A3C | Noisy Networks for Exploration |
| 4467.4 | Reactor ND | The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning |
| 4226 | QR-DQN-1 | Distributional Reinforcement Learning with Quantile Regression |
| 3726.1 | Reactor | The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning |
| 3508.1 | IMPALA (shallow) | IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures |
| 3455 | NoisyNet DQN | Noisy Networks for Exploration |
| 2898 | IQN | Implicit Quantile Networks for Distributional Reinforcement Learning |
| 2837.7 | DuDQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 2780.4 | Reactor | The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning |
| 2712.8 | Rainbow | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 2544 | A3C | Noisy Networks for Exploration |
| 2354.7 | Distributional DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 2335 | QR-DQN-0 | Distributional Reinforcement Learning with Quantile Regression |
| 2220 | DuDQN | Noisy Networks for Exploration |
| 2011.05 | IMPALA (deep, multitask) | IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures |
| 1824 | DQN | Noisy Networks for Exploration |
| 1629 | DQN | Human-level control through deep reinforcement learning |
| 1516 | C51 | A Distributional Perspective on Reinforcement Learning |
| 1364.5 | DQN | A Distributional Perspective on Reinforcement Learning |
| 1192.7 | PDD DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 1047.66 | Gorila DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 930.6 | DDQN | Deep Reinforcement Learning with Double Q-learning |
| 907.3 | Linear | Human-level control through deep reinforcement learning |
| 734.7 | DDQN | A Distributional Perspective on Reinforcement Learning |
| 719.1 | Random | Human-level control through deep reinforcement learning |
| 89 | Contingency | Human-level control through deep reinforcement learning |
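
The no-op starts protocol behind the table above, as described in the DQN papers, begins each evaluation episode with a random number of no-op actions (up to 30) before the agent takes control, so that episodes do not always start from the same state. Below is a minimal sketch of that evaluation loop, assuming a Gymnasium-style environment and an `agent.act(obs)` method; both names are placeholders, not APIs from the sources above.

```python
# Sketch of the 30-no-op evaluation protocol from the DQN literature.
# `env` follows the Gymnasium step/reset API; `agent.act(obs)` is a placeholder.
import random

NOOP_ACTION = 0   # NOOP is action 0 in the ALE action set
MAX_NOOPS = 30

def evaluate_noop_starts(env, agent, episodes: int = 30) -> float:
    returns = []
    for _ in range(episodes):
        obs, _ = env.reset()
        # Random number of no-ops before the agent acts.
        for _ in range(random.randint(1, MAX_NOOPS)):
            obs, _, terminated, truncated, _ = env.step(NOOP_ACTION)
            if terminated or truncated:
                obs, _ = env.reset()
        done, episode_return = False, 0.0
        while not done:
            obs, reward, terminated, truncated, _ = env.step(agent.act(obs))
            episode_return += reward
            done = terminated or truncated
        returns.append(episode_return)
    return sum(returns) / len(returns)  # mean undiscounted episode return
```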

### Normal Starts

| Result | Algorithm | Source |
|--------|-----------|--------|
| 2389.3 | ACER | Proximal Policy Optimization Algorithms |
| 2097.5 | PPO | Proximal Policy Optimization Algorithms |
| 1653.3 | A2C | Proximal Policy Optimization Algorithms |
| 1070 | DQN Ours | Deep Recurrent Q-Learning for Partially Observable MDPs |
| 1032 | DRQN | Deep Recurrent Q-Learning for Partially Observable MDPs |
| 1020 | DRQN | Deep Recurrent Q-Learning for Partially Observable MDPs |
| 1010 | DQN Ours | Deep Recurrent Q-Learning for Partially Observable MDPs |