Overview

Enduro consists of maneuvering a race car in the National Enduro, a long-distance endurance race. The object of the race is to pass a certain number of cars each day; doing so allows the player to continue racing on the next day. The driver must avoid the other racers and pass 200 cars on the first day, and 300 cars on each following day.

As time passes in the game, visibility changes as well. At night the player can only see the oncoming cars’ taillights. As the days progress, cars also become more difficult to avoid. Weather and time of day are factors in how to play: during the day the player may drive through an icy patch on the road, which limits control of the vehicle, or a patch of fog may reduce visibility.
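The daily survival rule described above (200 cars on the first day, 300 on each day after) can be sketched as a small helper; the function name is illustrative, not part of any game API:

```python
def cars_to_pass(day: int) -> int:
    """Number of cars the player must pass to keep racing on a given day.

    Day numbering starts at 1. Per the description above: 200 cars on
    the first day, 300 cars on each following day.
    """
    if day < 1:
        raise ValueError("day must be >= 1")
    return 200 if day == 1 else 300
```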

Description from Wikipedia

State of the Art

Human Starts

| Result | Method | Type | Score from |
| --- | --- | --- | --- |
| 2223.9 | DuelingPERDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 2133.4 | DistributionalDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 2077.4 | DuelingDDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 2061.1 | RainbowDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 2042.4 | ApeX DQN | DQN | Distributed Prioritized Experience Replay |
| 1884.4 | PERDDQN (prop) | DQN | Prioritized Experience Replay |
| 1831.0 | PERDDQN (rank) | DQN | Prioritized Experience Replay |
| 1265.6 | PERDQN (rank) | DQN | Prioritized Experience Replay |
| 1216.6 | DDQN | DQN | Deep Reinforcement Learning with Double Q-learning |
| 1021.5 | NoisyNetDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 740.2 | Human | Human | Massively Parallel Methods for Deep Reinforcement Learning |
| 626.7 | DQN2015 | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 475.6 | DQN2015 | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 71.04 | GorilaDQN | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| -81.8 | Random | Random | Massively Parallel Methods for Deep Reinforcement Learning |
| -82.2 | A3C FF (1 day) | PG | Asynchronous Methods for Deep Reinforcement Learning |
| -82.5 | A3C FF (4 days) | PG | Asynchronous Methods for Deep Reinforcement Learning |
| -82.5 | A3C LSTM | PG | Asynchronous Methods for Deep Reinforcement Learning |
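The papers cited in these tables usually compare agents via the human-normalized score, where random play maps to 0 and the human baseline maps to 1. A minimal sketch using the Human Starts baselines from the table above (random -81.8, human 740.2); the function name is illustrative:

```python
def human_normalized(score: float, random_score: float, human_score: float) -> float:
    """Rescale a raw game score so that random play = 0.0 and human play = 1.0."""
    return (score - random_score) / (human_score - random_score)

# Human Starts baselines from the table above.
RANDOM, HUMAN = -81.8, 740.2

# DuelingPERDQN's 2223.9 comes out to roughly 2.8x the human baseline.
print(round(human_normalized(2223.9, RANDOM, HUMAN), 2))  # prints 2.8
```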

No-op Starts

| Result | Method | Type | Score from |
| --- | --- | --- | --- |
| 3454.0 | C51 | Misc | A Distributional Perspective on Reinforcement Learning |
| 2306.4 | DuelingPERDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 2259.3 | DistributionalDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 2258.2 | DuelingDDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 2199.6 | DuelingPERDDQN | DQN | Deep Q-Learning from Demonstrations |
| 2177.4 | ApeX DQN | DQN | Distributed Prioritized Experience Replay |
| 2155.0 | PERDDQN (prop) | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 2125.9 | RainbowDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 2093.0 | PER | DQN | Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation |
| 2093.0 | PERDDQN (rank) | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 2064.0 | DuelingDQN | DQN | Noisy Networks for Exploration |
| 2013.0 | NoisyNet-DuelingDQN | DQN | Noisy Networks for Exploration |
| 2002.1 | DDQN+PopArt | DQN | Learning values across many orders of magnitude |
| 1929.8 | DQfD | Imitation | Deep Q-Learning from Demonstrations |
| 1240.0 | NoisyNet-DQN | DQN | Noisy Networks for Exploration |
| 1211.8 | DDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 1129.2 | NoisyNetDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 860.5 | Human | Human | Dueling Network Architectures for Deep Reinforcement Learning |
| 835.0 | DQN | DQN | Noisy Networks for Exploration |
| 729.0 | DQN2015 | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 319.5 | DDQN | DQN | Deep Reinforcement Learning with Double Q-learning |
| 309.6 | Human | Human | Human-level control through deep reinforcement learning |
| 301.8 | DQN2015 | DQN | Human-level control through deep reinforcement learning |
| 300.0 | NoisyNet-A3C | PG | Noisy Networks for Exploration |
| 159.4 | Contingency | Misc | Human-level control through deep reinforcement learning |
| 129.1 | Linear | Misc | Human-level control through deep reinforcement learning |
| 114.9 | GorilaDQN | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 0.0 | ACKTR | PG | Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation |
| 0 | Random | Random | Human-level control through deep reinforcement learning |
| 0.0 | A3C | PG | Noisy Networks for Exploration |

Normal Starts

| Result | Method | Type | Score from |
| --- | --- | --- | --- |
| 758.3 | PPO | PG | Proximal Policy Optimization Algorithms |
| 534.6 | TRPO (single path) | PG | Trust Region Policy Optimization |
| 470 | DQN2013 | DQN | Playing Atari with Deep Reinforcement Learning |
| 430.8 | TRPO (vine) | PG | Trust Region Policy Optimization |
| 368 | Human | Human | Playing Atari with Deep Reinforcement Learning |
| 159 | Contingency | Misc | Playing Atari with Deep Reinforcement Learning |
| 129 | Sarsa | Misc | Playing Atari with Deep Reinforcement Learning |
| 0 | Random | Random | Playing Atari with Deep Reinforcement Learning |
| 0.0 | A2C | PG | Proximal Policy Optimization Algorithms |
| 0.0 | ACER | PG | Proximal Policy Optimization Algorithms |