Overview

The objective of Asteroids is to destroy asteroids and saucers. The player controls a triangular ship that can rotate left and right, fire shots straight forward, and thrust forward.[3] Once the ship begins moving in a direction, it will continue in that direction for a time without player intervention unless the player applies thrust in a different direction. The ship eventually comes to a stop when not thrusting. The player can also send the ship into hyperspace, causing it to disappear and reappear in a random location on the screen, at the risk of self-destructing or appearing on top of an asteroid.[4]
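The motion model amounts to thrust-driven acceleration, continued drift, and gradual deceleration. The sketch below illustrates that behaviour in a few lines; the constants and the simple linear-drag model are illustrative assumptions, not the arcade game's actual physics.

```python
import math

# Illustrative sketch of the ship motion described above: thrust accelerates the
# ship along its heading, the ship keeps drifting when thrust is released, and a
# drag factor slowly brings it to a stop. Constants are assumed for illustration.
THRUST = 0.15  # acceleration per frame while thrusting (assumed)
DRAG = 0.99    # per-frame velocity decay while coasting (assumed)

def update_ship(x, y, vx, vy, heading_rad, thrusting):
    """Advance the ship one frame and return its new position and velocity."""
    if thrusting:
        vx += THRUST * math.cos(heading_rad)
        vy += THRUST * math.sin(heading_rad)
    else:
        # Without thrust the ship coasts and gradually slows down.
        vx *= DRAG
        vy *= DRAG
    return x + vx, y + vy, vx, vy
```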

Each level starts with a few large asteroids drifting in various directions on the screen. Objects wrap around screen edges – for instance, an asteroid that drifts off the top edge of the screen reappears at the bottom and continues moving in the same direction.[5] As the player shoots asteroids, they break into smaller asteroids that move faster and are more difficult to hit. Smaller asteroids are also worth more points. Two flying saucers appear periodically on the screen; the “big saucer” shoots randomly and poorly, while the “small saucer” fires frequently at the ship. After reaching a score of 40,000, only the small saucer appears. As the player’s score increases, the angle range of the shots from the small saucer narrows until the saucer fires extremely accurately.[6] Once the screen has been cleared of all asteroids and flying saucers, a new set of large asteroids appears, starting the next level. The game becomes harder as the number of asteroids per level grows, until the count stops increasing once the score reaches between 40,000 and 60,000.[7] The player starts with three lives after a coin is inserted and earns an extra life every 10,000 points.[8] When the player loses all their lives, the game ends.
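Two of these mechanics, the toroidal screen wrap and the extra life every 10,000 points, reduce to a couple of lines of arithmetic. The sketch below assumes the 160×210 ALE frame size for concreteness; the arcade original uses a different resolution.

```python
SCREEN_W, SCREEN_H = 160, 210  # assumed ALE frame size; the arcade original differs

def wrap(x, y):
    """Toroidal wrap: an object leaving one edge re-enters from the opposite edge."""
    return x % SCREEN_W, y % SCREEN_H

def extra_lives_earned(old_score, new_score, threshold=10_000):
    """Bonus ships awarded when the score crosses multiples of the threshold."""
    return new_score // threshold - old_score // threshold
```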

Description from Wikipedia

State of the Art

Human Starts

Scores below use the human-starts protocol, in which evaluation episodes begin from start states sampled from human play.

| Result | Method | Type | Score from |
| --- | --- | --- | --- |
| 117303.4 | ApeX DQN | DQN | Distributed Prioritized Experience Replay |
| 36517.3 | Human | Human | Massively Parallel Methods for Deep Reinforcement Learning |
| 5093.1 | A3C LSTM | PG | Asynchronous Methods for Deep Reinforcement Learning |
| 4474.5 | A3C FF (4 days) | PG | Asynchronous Methods for Deep Reinforcement Learning |
| 4078.1 | NoisyNetDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 3009.4 | A3C FF (1 day) | PG | Asynchronous Methods for Deep Reinforcement Learning |
| 2249.4 | RainbowDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 2071.7 | DistributionalDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 2035.4 | DuelingDDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 1745.1 | PERDDQN (rank) | DQN | Prioritized Experience Replay |
| 1677.2 | PERDQN (rank) | DQN | Prioritized Experience Replay |
| 1654.0 | PERDDQN (prop) | DQN | Prioritized Experience Replay |
| 1458.7 | DQN2015 | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 1193.2 | DDQN | DQN | Deep Reinforcement Learning with Double Q-learning |
| 1021.9 | DuelingPERDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 933.63 | GorilaDQN | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 877.1 | Random | Random | Massively Parallel Methods for Deep Reinforcement Learning |
| 871.3 | Random | Random | Deep Reinforcement Learning with Double Q-learning |
| 697.1 | DQN2015 | DQN | Massively Parallel Methods for Deep Reinforcement Learning |

No-op Starts

Scores below use the no-op-starts protocol, in which each evaluation episode begins with a random number of up to 30 no-op actions.

| Result | Method | Type | Score from |
| --- | --- | --- | --- |
| 155495.1 | ApeX DQN | DQN | Distributed Prioritized Experience Replay |
| 86700.0 | NoisyNet-DuelingDQN | DQN | Noisy Networks for Exploration |
| 47388.7 | Human | Human | Dueling Network Architectures for Deep Reinforcement Learning |
| 34171.6 | ACKTR | PG | Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation |
| 13157 | Human | Human | Human-level control through deep reinforcement learning |
| 4814.1 | NoisyNetDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 4541.0 | NoisyNet-A3C | PG | Noisy Networks for Exploration |
| 3917.6 | DuelingPERDDQN | DQN | Deep Q-Learning from Demonstrations |
| 3796.4 | DQfD | Imitation | Deep Q-Learning from Demonstrations |
| 3455.0 | NoisyNet-DQN | DQN | Noisy Networks for Exploration |
| 2869.3 | DDQN+PopArt | DQN | Learning values across many orders of magnitude |
| 2837.7 | DuelingDDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 2712.8 | RainbowDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 2654.3 | PER | DQN | Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation |
| 2654.3 | PERDDQN (rank) | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 2544.0 | A3C | PG | Noisy Networks for Exploration |
| 2354.7 | DistributionalDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 2220.0 | DuelingDQN | DQN | Noisy Networks for Exploration |
| 1824.0 | DQN | DQN | Noisy Networks for Exploration |
| 1699.3 | PERDDQN (prop) | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 1629 | DQN2015 | DQN | Human-level control through deep reinforcement learning |
| 1516.0 | C51 | Misc | A Distributional Perspective on Reinforcement Learning |
| 1364.5 | DQN2015 | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 1192.7 | DuelingPERDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 1047.66 | GorilaDQN | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 930.6 | DDQN | DQN | Deep Reinforcement Learning with Double Q-learning |
| 907.3 | Linear | Misc | Human-level control through deep reinforcement learning |
| 734.7 | DDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 719.1 | Random | Random | Human-level control through deep reinforcement learning |
| 89 | Contingency | Misc | Human-level control through deep reinforcement learning |

Normal Starts

Scores below come from evaluations that use the environment's standard starts, without the human-start or no-op protocols above.

| Result | Method | Type | Score from |
| --- | --- | --- | --- |
| 2389.3 | ACER | PG | Proximal Policy Optimization Algorithms |
| 2097.5 | PPO | PG | Proximal Policy Optimization Algorithms |
| 1653.3 | A2C | PG | Proximal Policy Optimization Algorithms |
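For reference, the sketch below shows one way to run evaluation episodes on the ALE Asteroids environment with Gymnasium under a no-op-starts scheme. The environment id `ALE/Asteroids-v5`, the 30-no-op limit, and the random placeholder policy are assumptions for illustration; v5 defaults such as sticky actions and frame skip differ from some of the papers' settings, so scores are not directly comparable to the tables above.

```python
import random

import gymnasium as gym  # assumes: pip install "gymnasium[atari,accept-rom-license]"

NUM_EPISODES = 5
MAX_NOOPS = 30  # no-op-starts protocol: up to 30 no-op actions at episode start

env = gym.make("ALE/Asteroids-v5")  # assumed environment id; v5 enables sticky actions by default

for episode in range(NUM_EPISODES):
    obs, info = env.reset(seed=episode)
    terminated = truncated = False

    # Apply a random number of no-op actions (action 0 is NOOP in ALE) before play begins.
    for _ in range(random.randint(1, MAX_NOOPS)):
        obs, reward, terminated, truncated, info = env.step(0)
        if terminated or truncated:
            obs, info = env.reset()
            terminated = truncated = False

    total_reward = 0.0
    while not (terminated or truncated):
        action = env.action_space.sample()  # placeholder for a trained policy
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += reward

    print(f"episode {episode}: score {total_reward}")

env.close()
```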