Overview

One or two players control chickens that can be made to run across a ten-lane highway filled with traffic in an effort to "get to the other side." Every time a chicken gets across, that player earns a point. If hit by a car, a chicken is forced back either slightly or all the way to the bottom of the screen, depending on the difficulty switch setting. The winner of a two-player game is the player who has scored the most points in the allotted two minutes and sixteen seconds. The chickens can only move up or down, and a clucking sound is heard whenever a chicken is struck by a car.

Description from Wikipedia
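
Since the scores below are measured in this environment, it can help to see how it is driven programmatically. The following is a minimal sketch that runs a random policy in Freeway through the Arcade Learning Environment's Gymnasium bindings; it assumes the `gymnasium[atari]` extra and the Atari ROMs are installed, and nothing in it is specific to the methods in the tables.

```python
# Minimal random-agent rollout in Freeway via Gymnasium's ALE bindings.
# Assumes the `gymnasium[atari]` extra and the Atari ROMs are available.
import gymnasium as gym

env = gym.make("ALE/Freeway-v5")
obs, info = env.reset(seed=0)

episode_return = 0.0
terminated = truncated = False
while not (terminated or truncated):
    action = env.action_space.sample()  # Freeway's minimal action set: NOOP/UP/DOWN
    obs, reward, terminated, truncated, info = env.step(action)
    episode_return += reward  # +1 each time the chicken gets across

env.close()
print(f"Episode return: {episode_return}")
```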

State of the Art

Human Starts
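
Scores in this table use the human-starts protocol of "Massively Parallel Methods for Deep Reinforcement Learning," in which each evaluation episode begins from a state sampled from human play.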

| Result | Method | Type | Score from |
|--------|--------|------|------------|
| 29.1 | RainbowDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 29.0 | ApeX DQN | DQN | Distributed Prioritized Experience Replay |
| 28.9 | PERDDQN (rank) | DQN | Prioritized Experience Replay |
| 28.8 | DDQN | DQN | Deep Reinforcement Learning with Double Q-learning |
| 28.8 | DistributionalDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 28.4 | PERDQN (rank) | DQN | Prioritized Experience Replay |
| 28.2 | DuelingPERDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 27.9 | PERDDQN (prop) | DQN | Prioritized Experience Replay |
| 27.1 | NoisyNetDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 26.9 | DQN2015 | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 25.8 | DQN2015 | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 25.6 | Human | Human | Massively Parallel Methods for Deep Reinforcement Learning |
| 10.16 | GorilaDQN | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 0.2 | DuelingDDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 0.2 | Random | Random | Massively Parallel Methods for Deep Reinforcement Learning |
| 0.1 | A3C FF (1 day) | PG | Asynchronous Methods for Deep Reinforcement Learning |
| 0.1 | A3C FF (4 days) | PG | Asynchronous Methods for Deep Reinforcement Learning |
| 0.1 | A3C LSTM | PG | Asynchronous Methods for Deep Reinforcement Learning |

No-op Starts
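
Scores in this table use the no-op-starts protocol of "Human-level control through deep reinforcement learning," in which each evaluation episode begins with a random number of no-op actions (up to 30) before the agent takes control.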

| Result | Method | Type | Score from |
|--------|--------|------|------------|
| 34.0 | DuelingDQN | DQN | Noisy Networks for Exploration |
| 34.0 | NoisyNet-DuelingDQN | DQN | Noisy Networks for Exploration |
| 34.0 | RainbowDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 33.9 | C51 | Misc | A Distributional Perspective on Reinforcement Learning |
| 33.7 | PER | DQN | Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation |
| 33.7 | ApeX DQN | DQN | Distributed Prioritized Experience Replay |
| 33.7 | PERDDQN (rank) | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 33.6 | DistributionalDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 33.4 | DDQN+PopArt | DQN | Learning values across many orders of magnitude |
| 33.3 | DDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 33.0 | DuelingPERDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 32.9 | PERDDQN (prop) | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 32.0 | NoisyNet-DQN | DQN | Noisy Networks for Exploration |
| 32.0 | NoisyNetDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 31.9 | DuelingPERDDQN | DQN | Deep Q-Learning from Demonstrations |
| 31.8 | DDQN | DQN | Deep Reinforcement Learning with Double Q-learning |
| 31.4 | DQfD | Imitation | Deep Q-Learning from Demonstrations |
| 31.0 | DQN | DQN | Noisy Networks for Exploration |
| 30.3 | DQN2015 | DQN | Human-level control through deep reinforcement learning |
| 29.6 | Human | Human | Human-level control through deep reinforcement learning |
| 19.7 | Contingency | Misc | Human-level control through deep reinforcement learning |
| 19.1 | Linear | Misc | Human-level control through deep reinforcement learning |
| 18.0 | NoisyNet-A3C | PG | Noisy Networks for Exploration |
| 11.69 | GorilaDQN | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 0.0 | DuelingDDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 0.0 | ACKTR | PG | Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation |
| 0.0 | Random | Random | Human-level control through deep reinforcement learning |
| 0.0 | A3C | PG | Noisy Networks for Exploration |
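
To make the protocol concrete, here is a minimal sketch of a no-op-starts evaluation loop, again using the Gymnasium ALE bindings. The `agent` object and its `act(obs)` method are hypothetical placeholders for a trained policy; the up-to-30-no-ops convention comes from the DQN paper cited above.

```python
import random

import gymnasium as gym


def evaluate_noop_start(env, agent, max_noops=30, seed=0):
    """Run one evaluation episode that begins with 1-30 no-op actions."""
    obs, info = env.reset(seed=seed)

    # Hold the NOOP action (index 0 in ALE) for a random number of frames,
    # so the agent never sees a single deterministic start state.
    for _ in range(random.randint(1, max_noops)):
        obs, reward, terminated, truncated, info = env.step(0)
        if terminated or truncated:
            obs, info = env.reset()

    # The (hypothetical) agent plays out the remainder of the episode.
    episode_return = 0.0
    terminated = truncated = False
    while not (terminated or truncated):
        obs, reward, terminated, truncated, info = env.step(agent.act(obs))
        episode_return += reward
    return episode_return


# Example: evaluate_noop_start(gym.make("ALE/Freeway-v5"), agent)
```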

Normal Starts
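
Scores in this table are taken from papers that evaluate with ordinary episode starts rather than the human-starts or no-op-starts protocols above.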

| Result | Method | Type | Score from |
|--------|--------|------|------------|
| 32.5 | PPO | PG | Proximal Policy Optimization Algorithms |
| 0.0 | A2C | PG | Proximal Policy Optimization Algorithms |
| 0.0 | ACER | PG | Proximal Policy Optimization Algorithms |