Overview

The player controls the titular character, James Bond, across four levels. The player is given a multi-purpose vehicle that acts as an automobile, a plane, and a submarine. The vehicle can fire shots and flare bombs, and travels from left to right as the player progresses through each level. The player can shoot or avoid the enemies and obstacles that appear throughout the game, including boats, frogmen, helicopters, missiles, and mini-submarines.

Description from Wikipedia

State of the Art

Human Starts

| Result | Method | Type | Score from |
|--------|--------|------|------------|
| 18992.3 | ApeX DQN | DQN | Distributed Prioritized Experience Replay |
| 3961.0 | PERDDQN (rank) | DQN | Prioritized Experience Replay |
| 3511.5 | PERDDQN (prop) | DQN | Prioritized Experience Replay |
| 1074.5 | PERDQN (rank) | DQN | Prioritized Experience Replay |
| 835.5 | DuelingDDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 697.5 | DQN2015 | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 613.0 | A3C LSTM | PG | Asynchronous Methods for Deep Reinforcement Learning |
| 585.0 | DuelingPERDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 573.0 | DDQN | DQN | Deep Reinforcement Learning with Double Q-learning |
| 541.0 | A3C FF (4 days) | PG | Asynchronous Methods for Deep Reinforcement Learning |
| 444.0 | GorilaDQN | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 368.5 | Human | Human | Massively Parallel Methods for Deep Reinforcement Learning |
| 351.5 | A3C FF (1 day) | PG | Asynchronous Methods for Deep Reinforcement Learning |
| 348.5 | DQN2015 | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 33.5 | Random | Random | Massively Parallel Methods for Deep Reinforcement Learning |

No-op Starts

| Result | Method | Type | Score from |
|--------|--------|------|------------|
| 21322.5 | ApeX DQN | DQN | Distributed Prioritized Experience Replay |
| 5148.0 | PER | DQN | Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation |
| 5148.0 | PERDDQN (rank) | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 4682.0 | NoisyNet-DuelingDQN | DQN | Noisy Networks for Exploration |
| 2095.0 | DQfD | Imitation | Deep Q-Learning from Demonstrations |
| 1909.0 | C51 | Misc | A Distributional Perspective on Reinforcement Learning |
| 1667.0 | DuelingDQN | DQN | Noisy Networks for Exploration |
| 1358.0 | DDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 1312.5 | DuelingDDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 1235.0 | NoisyNet-DQN | DQN | Noisy Networks for Exploration |
| 909.0 | DQN | DQN | Noisy Networks for Exploration |
| 812.0 | DuelingPERDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 768.5 | DQN2015 | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 639.6 | DuelingPERDDQN | DQN | Deep Q-Learning from Demonstrations |
| 605.0 | GorilaDQN | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 576.7 | DQN2015 | DQN | Human-level control through deep reinforcement learning |
| 509.0 | A3C | PG | Noisy Networks for Exploration |
| 507.5 | DDQN+PopArt | DQN | Learning values across many orders of magnitude |
| 490.0 | ACKTR | PG | Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation |
| 438.0 | DDQN | DQN | Deep Reinforcement Learning with Double Q-learning |
| 406.7 | Human | Human | Human-level control through deep reinforcement learning |
| 354.1 | Contingency | Misc | Human-level control through deep reinforcement learning |
| 302.8 | Human | Human | Dueling Network Architectures for Deep Reinforcement Learning |
| 202.8 | Linear | Misc | Human-level control through deep reinforcement learning |
| 188.0 | NoisyNet-A3C | PG | Noisy Networks for Exploration |
| 29 | Random | Random | Human-level control through deep reinforcement learning |
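Under the no-op starts protocol used for the table above, each evaluation episode begins with a random number of no-op actions (up to 30) before the agent takes control, so the agent cannot exploit a single deterministic start state. The sketch below illustrates the idea only; it assumes the classic Gym Atari API (4-tuple `step` returns), the environment id `JamesbondNoFrameskip-v4`, and a hypothetical `policy` callable mapping an observation to an action. It is not the evaluation code used by any of the papers cited here.

```python
# Minimal sketch of no-op starts evaluation (assumptions noted above).
import random

import gym  # assumes gym with the Atari extras (ale-py / atari-py) installed

NOOP = 0        # action 0 is NOOP in the full Atari action set
MAX_NOOPS = 30  # protocol: a random number of no-ops in [1, 30]


def evaluate_noop_starts(policy, episodes=30, env_id="JamesbondNoFrameskip-v4"):
    """Return the mean episode score of `policy` under no-op starts."""
    env = gym.make(env_id)
    returns = []
    for _ in range(episodes):
        obs = env.reset()
        total, done = 0.0, False
        # Random no-op prefix so each episode starts from a slightly different state.
        for _ in range(random.randint(1, MAX_NOOPS)):
            obs, reward, done, _ = env.step(NOOP)
            total += reward
            if done:
                obs = env.reset()
        # Let the policy play out the remainder of the episode.
        while not done:
            obs, reward, done, _ = env.step(policy(obs))
            total += reward
        returns.append(total)
    env.close()
    return sum(returns) / len(returns)
```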

Normal Starts

| Result | Method | Type | Score from |
|--------|--------|------|------------|
| 560.7 | PPO | PG | Proximal Policy Optimization Algorithms |
| 261.8 | ACER | PG | Proximal Policy Optimization Algorithms |
| 52.3 | A2C | PG | Proximal Policy Optimization Algorithms |