Overview

In Chopper Command, the player controls a military helicopter escorting a convoy of trucks through a desert. The goal is to destroy all enemy fighter jets and helicopters that attack the player's helicopter and the friendly trucks traveling below; destroying them all ends the current wave. The game ends when the player loses all lives or reaches 999,999 points. A radar, called the Long Range Scanner in the instruction manual, shows all enemies, including those not visible on the main screen.

Description from Wikipedia

State of the Art

Human Starts

| Result | Method | Type | Score from |
|---|---|---|---|
| 576601.5 | ApeX DQN | DQN | Distributed Prioritized Experience Replay |
| 10916.0 | RainbowDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 10150.0 | A3C LSTM | PG | Asynchronous Methods for Deep Reinforcement Learning |
| 9600.5 | DistributionalDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 8930.0 | Human | Human | Massively Parallel Methods for Deep Reinforcement Learning |
| 8778.5 | NoisyNetDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 8058.0 | DuelingPERDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 7021.0 | A3C FF (4 days) | PG | Asynchronous Methods for Deep Reinforcement Learning |
| 6685.0 | PERDQN (rank) | DQN | Prioritized Experience Replay |
| 6604.0 | PERDDQN (prop) | DQN | Prioritized Experience Replay |
| 5017.0 | DQN2015 | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 4669.0 | A3C FF (1 day) | PG | Asynchronous Methods for Deep Reinforcement Learning |
| 4635.0 | PERDDQN (rank) | DQN | Prioritized Experience Replay |
| 3784.0 | DuelingDDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 3495.0 | DDQN | DQN | Deep Reinforcement Learning with Double Q-learning |
| 3191.75 | GorilaDQN | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 3046.0 | DQN2015 | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 644.0 | Random | Random | Massively Parallel Methods for Deep Reinforcement Learning |
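These papers typically compare agents across games via the human-normalized score, 100 × (agent − random) / (human − random). A minimal sketch using the Random and Human baselines from the Human Starts table above (the helper name and the example entries are illustrative, not from any particular paper's code):

```python
# Human-normalized score, the standard cross-game metric in the DQN
# literature: 100 * (agent - random) / (human - random).
def human_normalized(agent: float, random: float, human: float) -> float:
    return 100.0 * (agent - random) / (human - random)

# Human Starts baselines for Chopper Command, taken from the table above.
RANDOM_SCORE = 644.0
HUMAN_SCORE = 8930.0

# A couple of entries from the table, for illustration.
for name, score in [("ApeX DQN", 576601.5), ("RainbowDQN", 10916.0)]:
    print(f"{name}: {human_normalized(score, RANDOM_SCORE, HUMAN_SCORE):.1f}%")
```

A score above 100% means the agent beat the human baseline under this protocol; ApeX DQN's Chopper Command result works out to roughly 6951%.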

No-op Starts

| Result | Method | Type | Score from |
|---|---|---|---|
| 721851.0 | ApeX DQN | DQN | Distributed Prioritized Experience Replay |
| 16654.0 | RainbowDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 15600.0 | C51 | Misc | A Distributional Perspective on Reinforcement Learning |
| 13185.0 | DuelingPERDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 13136.0 | DistributionalDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 11477.0 | NoisyNet-DuelingDQN | DQN | Noisy Networks for Exploration |
| 11215.0 | DuelingDDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 9882 | Human | Human | Human-level control through deep reinforcement learning |
| 9519.0 | NoisyNetDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 8893.0 | NoisyNet-DQN | DQN | Noisy Networks for Exploration |
| 8600.0 | PERDDQN (rank) | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 7561.0 | NoisyNet-A3C | PG | Noisy Networks for Exploration |
| 7388.0 | DuelingDQN | DQN | Noisy Networks for Exploration |
| 7387.8 | Human | Human | Dueling Network Architectures for Deep Reinforcement Learning |
| 7271.0 | DQN | DQN | Noisy Networks for Exploration |
| 6993.1 | DQfD | Imitation | Deep Q-Learning from Demonstrations |
| 6973.8 | DuelingPERDDQN | DQN | Deep Q-Learning from Demonstrations |
| 6687 | DQN2015 | DQN | Human-level control through deep reinforcement learning |
| 6126.0 | DQN2015 | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 5809.0 | DDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 5285.0 | A3C | PG | Noisy Networks for Exploration |
| 5135.0 | PERDDQN (prop) | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 4653.0 | DDQN | DQN | Deep Reinforcement Learning with Double Q-learning |
| 4167.5 | GorilaDQN | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 1582 | Linear | Misc | Human-level control through deep reinforcement learning |
| 811 | Random | Random | Human-level control through deep reinforcement learning |
| 775.0 | DDQN+PopArt | DQN | Learning values across many orders of magnitude |
| 16.9 | Contingency | Misc | Human-level control through deep reinforcement learning |

Normal Starts

| Result | Method | Type | Score from |
|---|---|---|---|
| 5287.7 | ACER | PG | Proximal Policy Optimization Algorithms |
| 3516.3 | PPO | PG | Proximal Policy Optimization Algorithms |
| 1171.7 | A2C | PG | Proximal Policy Optimization Algorithms |