Overview

As in Pac-Man, the player is opposed by enemies who kill on contact. The enemies gradually increase in number as the player advances from one level to the next, and their speed also increases. On odd-numbered levels, the player controls an ape (in some versions labeled “Copier”), and must collect coconuts while avoiding headhunters (labeled “Police” and “Thief”). On even-numbered levels, the player controls a paint roller (labeled “Rustler”), and must paint over each spot of the board while avoiding pigs (labeled “Cattle” and “Thief”). Each level is followed by a short bonus stage.

Whenever a rectangular portion of the board is cleared (either by collecting all surrounding coconuts or by painting all surrounding edges), the rectangle is colored in, and on even-numbered levels bonus points are awarded (on odd-numbered levels, the player instead scores points for each coconut eaten). When the player clears all four corners of the board, he is briefly empowered to kill the enemies by touching them (just as when Pac-Man uses a “power pill”). Enemies killed in this way fall to the bottom of the screen and revitalise themselves after a few moments.

The game controls consist of a joystick and a single button labeled “Jump,” which can be used up to three times; the jumps are restored when the player clears a level or loses a life. Pressing the jump button does not cause the player to jump, but causes all the enemies to jump, enabling the player to walk under them.

An extra life is awarded at 50,000 points, and another for every additional 80,000 points scored, up to 930,000; after that, no further extra lives are given.

Description from Wikipedia
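
The extra-life schedule above is easy to tabulate. The snippet below is a small illustrative helper (not part of the game or any published code) that counts the extra lives earned at a given score, assuming the first life at 50,000 points and one per additional 80,000 points up to the 930,000 cap.

```python
def extra_lives_earned(score: int) -> int:
    """Illustrative only: extra lives under the schedule described above
    (first at 50,000 points, then one per additional 80,000, capped at 930,000)."""
    if score < 50_000:
        return 0
    capped = min(score, 930_000)
    return 1 + (capped - 50_000) // 80_000


# For example, a score of 250,000 yields 3 extra lives
# (earned at 50,000, 130,000, and 210,000), and the cap gives at most 12.
print(extra_lives_earned(250_000))    # 3
print(extra_lives_earned(1_000_000))  # 12
```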

State of the Art

Human Starts

| Result | Method | Type | Score from |
| --- | --- | --- | --- |
| 1540.4 | Human | Human | Massively Parallel Methods for Deep Reinforcement Learning |
| 1047.3 | ApeX DQN | DQN | Distributed Prioritized Experience Replay |
| 283.9 | A3C FF (1 day) | PG | Asynchronous Methods for Deep Reinforcement Learning |
| 263.9 | A3C FF (4 days) | PG | Asynchronous Methods for Deep Reinforcement Learning |
| 238.4 | DuelingPERDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 237.7 | DistributionalDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 218.4 | PERDDQN (prop) | DQN | Prioritized Experience Replay |
| 202.8 | RainbowDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 189.15 | GorilaDQN | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 178.4 | DQN2015 | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 173.0 | A3C LSTM | PG | Asynchronous Methods for Deep Reinforcement Learning |
| 172.7 | DuelingDDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 169.1 | DDQN | DQN | Prioritized Experience Replay |
| 159.1 | DDQN | DQN | Deep Reinforcement Learning with Double Q-learning |
| 148.0 | NoisyNetDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 133.4 | DQN2015 | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 129.1 | PERDDQN (rank) | DQN | Prioritized Experience Replay |
| 98.9 | PERDQN (rank) | DQN | Prioritized Experience Replay |
| 11.8 | Random | Random | Massively Parallel Methods for Deep Reinforcement Learning |

No-op Starts

| Result | Method | Type | Score from |
| --- | --- | --- | --- |
| 8659.2 | ApeX DQN | DQN | Distributed Prioritized Experience Replay |
| 5131.2 | RainbowDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 3537.0 | NoisyNet-DuelingDQN | DQN | Noisy Networks for Exploration |
| 2354.5 | DuelingDDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 2325.0 | DQfD | Imitation | Deep Q-Learning from Demonstrations |
| 2296.8 | DuelingPERDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 2296.0 | DuelingDQN | DQN | Noisy Networks for Exploration |
| 2140.4 | DuelingPERDDQN | DQN | Deep Q-Learning from Demonstrations |
| 2051.8 | PERDDQN (prop) | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 1838.9 | PER | DQN | Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation |
| 1838.9 | PERDDQN (rank) | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 1793.3 | DDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 1735.0 | C51 | Misc | A Distributional Perspective on Reinforcement Learning |
| 1719.5 | Human | Human | Dueling Network Architectures for Deep Reinforcement Learning |
| 1676 | Human | Human | Human-level control through deep reinforcement learning |
| 1610.0 | NoisyNet-DQN | DQN | Noisy Networks for Exploration |
| 1608.0 | NoisyNetDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 1267.9 | DistributionalDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 1189.7 | GorilaDQN | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 1059.4 | ACKTR | PG | Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation |
| 978.0 | DQN2015 | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 924.0 | DQN | DQN | Noisy Networks for Exploration |
| 904.0 | A3C | PG | Noisy Networks for Exploration |
| 782.5 | DDQN+PopArt | DQN | Learning values across many orders of magnitude |
| 739.5 | DQN2015 | DQN | Human-level control through deep reinforcement learning |
| 702.1 | DDQN | DQN | Deep Reinforcement Learning with Double Q-learning |
| 491.0 | NoisyNet-A3C | PG | Noisy Networks for Exploration |
| 183.6 | Contingency | Misc | Human-level control through deep reinforcement learning |
| 103.4 | Linear | Misc | Human-level control through deep reinforcement learning |
| 5.8 | Random | Random | Human-level control through deep reinforcement learning |
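
For context, “no-op starts” is the evaluation protocol from the DQN literature in which each test episode begins with a random number of no-op actions (up to 30) before the agent takes control, whereas “human starts” begins episodes from states sampled from human play. The sketch below illustrates the no-op protocol using the Gymnasium ALE interface; the environment id, the `policy` callable, and the episode count are assumptions for illustration, not code from any of the cited papers.

```python
import random

import gymnasium as gym  # assumes gymnasium with the ale-py / Atari extra installed


def evaluate_noop_starts(policy, episodes=100, max_noops=30, seed=0):
    """Sketch of "no-op starts" evaluation: prefix each episode with a
    random number of no-op actions (ALE action 0) before the policy acts."""
    env = gym.make("ALE/Amidar-v5")  # env id is an assumption; adjust to your setup
    rng = random.Random(seed)
    returns = []
    for _ in range(episodes):
        obs, _ = env.reset()
        total, done = 0.0, False
        for _ in range(rng.randint(1, max_noops)):  # random no-op prefix
            obs, reward, terminated, truncated, _ = env.step(0)
            total += reward
            done = terminated or truncated
            if done:
                break
        while not done:
            obs, reward, terminated, truncated, _ = env.step(policy(obs))
            total += reward
            done = terminated or truncated
        returns.append(total)
    env.close()
    return sum(returns) / len(returns)
```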

Normal Starts

| Result | Method | Type | Score from |
| --- | --- | --- | --- |
| 827.6 | ACER | PG | Proximal Policy Optimization Algorithms |
| 674.6 | PPO | PG | Proximal Policy Optimization Algorithms |
| 380.8 | A2C | PG | Proximal Policy Optimization Algorithms |