Overview

In Fishing Derby, two fishermen sit on opposite docks over a lake filled with fish (and a shark that passes through). Using the joystick, the player moves his line left, right, up, and down in the water. When a fish is hooked, the line slowly rises to the surface of the water; pressing the fire button reels the fish in faster. If both fishermen have fish hooked, only the one who hooked his fish first can reel it in. The shark roaming the water tries to eat hooked fish before they surface.

The objective for both fishermen is to be the first to reach 99 pounds of fish. There are six rows of fish: the top two rows hold 2 lb. fish, the middle two rows 4 lb. fish, and the bottom two rows 6 lb. fish. The heavier fish sit at the bottom, but they are harder to bring in, since they run a higher risk of being eaten by the shark.

Description from Wikipedia
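
The game is available as one of the standard Arcade Learning Environment ROMs. Below is a minimal sketch of loading it and running a random policy, assuming the gymnasium and ale-py packages; the "ALE/FishingDerby-v5" ID and the register_envs call follow the current ale-py documentation pattern, so verify them against your installed versions. The episode return corresponds to the final score margin over the CPU opponent, which is consistent with the random baselines in the tables below sitting near -92.

```python
# Minimal sketch: Fishing Derby via the Arcade Learning Environment.
# Assumes the gymnasium and ale-py packages are installed.
import gymnasium as gym
import ale_py

gym.register_envs(ale_py)  # make the ALE/* environment IDs visible to gym.make

env = gym.make("ALE/FishingDerby-v5")
obs, info = env.reset(seed=0)

episode_return = 0.0
terminated = truncated = False
while not (terminated or truncated):
    action = env.action_space.sample()  # random policy, for illustration only
    obs, reward, terminated, truncated, info = env.step(action)
    episode_return += reward

env.close()
print(f"Episode return: {episode_return}")  # random play lands near -92
```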

State of the Art

Human Starts

| Result | Method | Type | Score from |
|--------|--------|------|------------|
| 22.6 | A3C LSTM | PG | Asynchronous Methods for Deep Reinforcement Learning |
| 22.6 | RainbowDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 22.4 | ApeX DQN | DQN | Distributed Prioritized Experience Replay |
| 18.8 | A3C FF (4 days) | PG | Asynchronous Methods for Deep Reinforcement Learning |
| 17.0 | DuelingPERDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 13.6 | A3C FF (1 day) | PG | Asynchronous Methods for Deep Reinforcement Learning |
| 9.8 | PERDDQN (rank) | DQN | Prioritized Experience Replay |
| 9.2 | PERDDQN (prop) | DQN | Prioritized Experience Replay |
| 5.1 | Human | Human | Massively Parallel Methods for Deep Reinforcement Learning |
| 4.64 | GorilaDQN | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 3.5 | PERDQN (rank) | DQN | Prioritized Experience Replay |
| 3.2 | DDQN | DQN | Deep Reinforcement Learning with Double Q-learning |
| -2.3 | DQN2015 | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| -3.7 | NoisyNetDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| -4.1 | DuelingDDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| -4.9 | DistributionalDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| -77.1 | Random | Random | Massively Parallel Methods for Deep Reinforcement Learning |
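
Human starts, introduced in "Massively Parallel Methods for Deep Reinforcement Learning", evaluate each episode from a state sampled from human play rather than from the game's initial state, so agents cannot simply memorize a single trajectory. A sketch of the idea follows, assuming a list of emulator snapshots captured beforehand with ale-py's cloneState(); the agent.act(obs) policy interface is hypothetical, and wrapper state (e.g., frame stacks) is ignored for brevity.

```python
import random

def evaluate_human_start(env, agent, human_states, rng=random):
    """One evaluation episode starting from a sampled human-play state.

    human_states: emulator snapshots captured during human play via
    env.unwrapped.ale.cloneState(); agent is a hypothetical policy
    object exposing act(obs) -> action.
    """
    obs, info = env.reset()
    env.unwrapped.ale.restoreState(rng.choice(human_states))
    obs, _, terminated, truncated, _ = env.step(0)  # one NOOP to refresh obs

    episode_return = 0.0
    while not (terminated or truncated):
        obs, reward, terminated, truncated, info = env.step(agent.act(obs))
        episode_return += reward
    return episode_return
```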

No-op Starts

| Result | Method | Type | Score from |
|--------|--------|------|------------|
| 57.0 | NoisyNet-DuelingDQN | DQN | Noisy Networks for Exploration |
| 46.4 | DuelingDDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 45.1 | DDQN+PopArt | DQN | Learning values across many orders of magnitude |
| 44.4 | ApeX DQN | DQN | Distributed Prioritized Experience Replay |
| 41.3 | DuelingPERDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 39.5 | PER | DQN | Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation |
| 39.5 | PERDDQN (rank) | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 38.4 | DQfD | Imitation | Deep Q-Learning from Demonstrations |
| 35.0 | DuelingDQN | DQN | Noisy Networks for Exploration |
| 33.73 | ACKTR | PG | Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation |
| 31.3 | RainbowDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 30.2 | PERDDQN (prop) | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 28.6 | DuelingPERDDQN | DQN | Deep Q-Learning from Demonstrations |
| 20.3 | DDQN | DQN | Deep Reinforcement Learning with Double Q-learning |
| 20.19 | GorilaDQN | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 15.5 | DDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 11.0 | NoisyNet-DQN | DQN | Noisy Networks for Exploration |
| 9.1 | DistributionalDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 8.9 | C51 | Misc | A Distributional Perspective on Reinforcement Learning |
| 7.7 | NoisyNetDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 5.5 | Human | Human | Human-level control through deep reinforcement learning |
| 4.0 | DQN | DQN | Noisy Networks for Exploration |
| -0.8 | DQN2015 | DQN | Human-level control through deep reinforcement learning |
| -4.9 | DQN2015 | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| -7.0 | A3C | PG | Noisy Networks for Exploration |
| -38.0 | NoisyNet-A3C | PG | Noisy Networks for Exploration |
| -38.7 | Human | Human | Dueling Network Architectures for Deep Reinforcement Learning |
| -85.1 | Contingency | Misc | Human-level control through deep reinforcement learning |
| -89.5 | Linear | Misc | Human-level control through deep reinforcement learning |
| -91.7 | Random | Random | Human-level control through deep reinforcement learning |
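
No-op starts, the evaluation protocol from "Human-level control through deep reinforcement learning", begin each episode with a random number of no-op actions (up to 30) so the agent does not always face an identical start state. A hedged sketch, assuming a Gymnasium ALE environment where action 0 is NOOP and the same hypothetical agent.act(obs) interface as above:

```python
import random

NOOP = 0        # action 0 is NOOP in ALE's action set
MAX_NOOPS = 30  # cap used in the DQN evaluation protocol

def evaluate_noop_start(env, agent, rng=random):
    """One evaluation episode under the up-to-30-no-op starts protocol."""
    obs, info = env.reset()
    for _ in range(rng.randint(1, MAX_NOOPS)):
        obs, _, terminated, truncated, _ = env.step(NOOP)
        if terminated or truncated:
            obs, info = env.reset()

    episode_return = 0.0
    terminated = truncated = False
    while not (terminated or truncated):
        obs, reward, terminated, truncated, info = env.step(agent.act(obs))
        episode_return += reward
    return episode_return
```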

Normal Starts

| Result | Method | Type | Score from |
|--------|--------|------|------------|
| 34.7 | ACER | PG | Proximal Policy Optimization Algorithms |
| 20.6 | A2C | PG | Proximal Policy Optimization Algorithms |
| 17.8 | PPO | PG | Proximal Policy Optimization Algorithms |