# Atari Asteroids Environment

## Overview

The objective of Asteroids is to destroy asteroids and saucers. The player controls a triangular ship that can rotate left and right, fire shots straight forward, and thrust forward.[3] Once the ship begins moving in a direction, it will continue in that direction for a time without player intervention unless the player applies thrust in a different direction. The ship eventually comes to a stop when not thrusting. The player can also send the ship into hyperspace, causing it to disappear and reappear in a random location on the screen, at the risk of self-destructing or appearing on top of an asteroid.[4]
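
The same controls are exposed to an agent as a discrete action set through the Arcade Learning Environment. The snippet below is a minimal sketch of loading the game via Gymnasium's ALE interface and listing the action meanings; the packages (`gymnasium`, `ale-py`) and the environment id `ALE/Asteroids-v5` are not part of the original description and may differ across versions.

```python
# Minimal sketch: inspect Asteroids' discrete action set via Gymnasium + ALE.
# Assumes gymnasium and ale-py are installed; the env id may differ in older versions.
import gymnasium as gym
import ale_py

gym.register_envs(ale_py)  # recent ale-py releases need this; older ones auto-register

env = gym.make("ALE/Asteroids-v5")
print(env.action_space)                     # discrete action set
print(env.unwrapped.get_action_meanings())  # NOOP, FIRE, UP (thrust), LEFT/RIGHT (rotate), ...

obs, info = env.reset(seed=0)
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
env.close()
```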

Each level starts with a few large asteroids drifting in various directions on the screen. Objects wrap around screen edges: for instance, an asteroid that drifts off the top edge of the screen reappears at the bottom and continues moving in the same direction.[5] As the player shoots asteroids, they break into smaller asteroids that move faster and are more difficult to hit; smaller asteroids are also worth more points. Two flying saucers appear periodically on the screen: the “big saucer” shoots randomly and poorly, while the “small saucer” fires frequently at the ship. After the score reaches 40,000, only the small saucer appears, and as the score increases further, the angle range of its shots narrows until it fires extremely accurately.[6] Once the screen has been cleared of all asteroids and flying saucers, a new set of large asteroids appears, starting the next level. The game gets harder as the number of asteroids increases, until the score reaches somewhere between 40,000 and 60,000, after which the difficulty stops increasing.[7] The player starts with three lives after inserting a coin and gains an extra life for every 10,000 points.[8] When the player loses all their lives, the game ends.
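
The screen-wrapping rule described above is effectively modular arithmetic on object positions. The toy sketch below illustrates it; this is not the emulator's actual logic, and the 160×210 frame size is simply the standard Atari screen resolution, used here only for illustration.

```python
# Toy illustration of screen wrap-around: an object that drifts off one edge
# reappears at the opposite edge with its velocity unchanged.
SCREEN_W, SCREEN_H = 160, 210  # standard Atari frame size, illustrative only

def wrap_position(x: float, y: float) -> tuple[float, float]:
    """Wrap a position onto the torus formed by joining opposite screen edges."""
    return x % SCREEN_W, y % SCREEN_H

# An asteroid moving past the top edge reappears at the bottom.
x, y, vx, vy = 80.0, 2.0, 0.0, -5.0
x, y = wrap_position(x + vx, y + vy)
print(x, y)  # -> (80.0, 207.0)
```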

*Description from Wikipedia*

## Performances of RL Agents

The tables below list scores reported for various reinforcement learning algorithms evaluated in this environment. These results are taken from RL Database. If this page was helpful, please consider giving a star!


### Human Starts

| Result | Algorithm | Source |
|--------|-----------|--------|
| 36517.3 | Human | Massively Parallel Methods for Deep Reinforcement Learning |
| 5093.1 | A3C LSTM | Asynchronous Methods for Deep Reinforcement Learning |
| 4474.5 | A3C FF | Asynchronous Methods for Deep Reinforcement Learning |
| 3009.4 | A3C FF 1 day | Asynchronous Methods for Deep Reinforcement Learning |
| 2249.4 | Rainbow | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 2071.7 | Distributional DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 2035.4 | DuDQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 1745.1 | Prioritized DDQN (rank, tuned) | Prioritized Experience Replay |
| 1677.2 | Prioritized DQN (rank) | Prioritized Experience Replay |
| 1654.0 | Prioritized DDQN (prop, tuned) | Prioritized Experience Replay |
| 1219.0 | DDQN | Deep Reinforcement Learning with Double Q-learning |
| 1193.2 | DDQN (tuned) | Deep Reinforcement Learning with Double Q-learning |
| 1021.9 | PDD DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 933.63 | Gorila DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 871.3 | Random | Massively Parallel Methods for Deep Reinforcement Learning |
| 697.1 | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
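
A common way to read these tables is the human-normalized score used throughout the DQN literature: 0% corresponds to the random baseline and 100% to the human baseline. The sketch below uses the Human and Random rows from the human-starts table above; the function name is just illustrative.

```python
# Human-normalized score: 0.0 = random baseline, 1.0 (100%) = human baseline.
# Baseline values are taken from the human-starts table above.
HUMAN_SCORE = 36517.3   # Human (Massively Parallel Methods for Deep RL)
RANDOM_SCORE = 871.3    # Random (same source)

def human_normalized(score: float) -> float:
    return (score - RANDOM_SCORE) / (HUMAN_SCORE - RANDOM_SCORE)

print(f"{human_normalized(2249.4):.2%}")  # Rainbow, human starts -> roughly 3.9%
```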

### No-op Starts

| Result | Algorithm | Source |
|--------|-----------|--------|
| 108590.05 | IMPALA (deep) | IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures |
| 86700 | NoisyNet DuDQN | Noisy Networks for Exploration |
| 47388.7 | Human | Dueling Network Architectures for Deep Reinforcement Learning |
| 34171.6 | ACKTR | Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation |
| 13156.7 | Human | Human-level control through deep reinforcement learning |
| 4541 | NoisyNet A3C | Noisy Networks for Exploration |
| 4467.4 | Reactor ND | The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning |
| 4226 | QR-DQN-1 | Distributional Reinforcement Learning with Quantile Regression |
| 3726.1 | Reactor | The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning |
| 3508.1 | IMPALA (shallow) | IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures |
| 3455 | NoisyNet DQN | Noisy Networks for Exploration |
| 2898 | IQN | Implicit Quantile Networks for Distributional Reinforcement Learning |
| 2837.7 | DuDQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 2780.4 | Reactor | The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning |
| 2712.8 | Rainbow | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 2544 | A3C | Noisy Networks for Exploration |
| 2354.7 | Distributional DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 2335 | QR-DQN-0 | Distributional Reinforcement Learning with Quantile Regression |
| 2220 | DuDQN | Noisy Networks for Exploration |
| 2011.05 | IMPALA (deep, multitask) | IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures |
| 1824 | DQN | Noisy Networks for Exploration |
| 1629 | DQN | Human-level control through deep reinforcement learning |
| 1516 | C51 | A Distributional Perspective on Reinforcement Learning |
| 1364.5 | DQN | A Distributional Perspective on Reinforcement Learning |
| 1192.7 | PDD DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 1047.66 | Gorila DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 930.6 | DDQN | Deep Reinforcement Learning with Double Q-learning |
| 907.3 | Linear | Human-level control through deep reinforcement learning |
| 734.7 | DDQN | A Distributional Perspective on Reinforcement Learning |
| 719.1 | Random | Human-level control through deep reinforcement learning |
| 89 | Contingency | Human-level control through deep reinforcement learning |
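
The no-op starts protocol behind the table above, as described in the DQN papers, begins each evaluation episode with a random number of no-op actions (up to 30) before the agent takes control, so that episodes do not always start from the same state. Below is a minimal sketch of that evaluation loop, assuming a Gymnasium-style environment and an `agent.act(obs)` method; both names are placeholders, not APIs from the sources above.

```python
# Sketch of the 30-no-op evaluation protocol from the DQN literature.
# `env` follows the Gymnasium step/reset API; `agent.act(obs)` is a placeholder.
import random

NOOP_ACTION = 0   # NOOP is action 0 in the ALE action set
MAX_NOOPS = 30

def evaluate_noop_starts(env, agent, episodes: int = 30) -> float:
    returns = []
    for _ in range(episodes):
        obs, _ = env.reset()
        # Random number of no-ops before the agent acts.
        for _ in range(random.randint(1, MAX_NOOPS)):
            obs, _, terminated, truncated, _ = env.step(NOOP_ACTION)
            if terminated or truncated:
                obs, _ = env.reset()
        done, episode_return = False, 0.0
        while not done:
            obs, reward, terminated, truncated, _ = env.step(agent.act(obs))
            episode_return += reward
            done = terminated or truncated
        returns.append(episode_return)
    return sum(returns) / len(returns)  # mean undiscounted episode return
```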

### Normal Starts

| Result | Algorithm | Source |
|--------|-----------|--------|
| 2389.3 | ACER | Proximal Policy Optimization Algorithms |
| 2097.5 | PPO | Proximal Policy Optimization Algorithms |
| 1653.3 | A2C | Proximal Policy Optimization Algorithms |
| 1070 | DQN Ours | Deep Recurrent Q-Learning for Partially Observable MDPs |
| 1032 | DRQN | Deep Recurrent Q-Learning for Partially Observable MDPs |
| 1020 | DRQN | Deep Recurrent Q-Learning for Partially Observable MDPs |
| 1010 | DQN Ours | Deep Recurrent Q-Learning for Partially Observable MDPs |