Overview

In Private Eye, players assume the role of Pierre Touché, a private investigator assigned the task of capturing the criminal mastermind Henri Le Fiend. Le Fiend is implicated in a number of crimes across the city, and the player must find the clues and the stolen property in order to arrest him.

The game consists of four separate cases. Using a specially built Model A that can jump over obstacles, players must search the city for a specific clue to the crime and for the object stolen in the crime. Each item must then be returned to its point of origin: the clue is taken to a business to verify that it came from there, and the stolen object is returned to its rightful owner. The items may be discovered in either order, but the player may carry only one item at a time. When both items have been located and returned, the player must find and capture Le Fiend and take him to jail, successfully closing the case.

However, the city is full of street thugs who will attack the player. If the player is hit while carrying an item (either the clue or the stolen property), the item is lost and must be found again. Further, each case has a statute of limitations, which serves as the game’s time limit. To win, the player must locate and verify the clue, locate and return the stolen property, and finally locate Le Fiend and take him to jail, all within the time allotted.

The player starts with 1000 “merit points”. Points are lost whenever the player hits an obstacle or is attacked by a thug, and are awarded whenever an item is located or returned and whenever a thug (or Le Fiend himself) is nabbed. Each case is a separate game variation; when the case is solved or time runs out, the game ends. A fifth variation requires the player to solve all four crimes at the same time.

Description from Wikipedia

State of the Art

Human Starts

| Result | Method | Type | Score from |
| --- | --- | --- | --- |
| 64169.1 | Human | Human | Massively Parallel Methods for Deep Reinforcement Learning |
| 5955.4 | NoisyNetDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 5717.5 | DistributionalDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 2598.55 | GorilaDQN | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 2202.3 | PERDQN (rank) | DQN | Prioritized Experience Replay |
| 1704.4 | RainbowDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 1277.6 | DuelingPERDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 864.7 | ApeX DQN | DQN | Distributed Prioritized Experience Replay |
| 670.7 | PERDDQN (rank) | DQN | Prioritized Experience Replay |
| 662.8 | Random | Random | Massively Parallel Methods for Deep Reinforcement Learning |
| 421.1 | A3C LSTM | PG | Asynchronous Methods for Deep Reinforcement Learning |
| 298.2 | DQN2015 | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 292.6 | DuelingDDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 207.9 | DQN2015 | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 206.9 | A3C FF (4 days) | PG | Asynchronous Methods for Deep Reinforcement Learning |
| 194.4 | A3C FF (1 day) | PG | Asynchronous Methods for Deep Reinforcement Learning |
| 179.0 | PERDDQN (prop) | DQN | Prioritized Experience Replay |
| -575.5 | DDQN | DQN | Deep Reinforcement Learning with Double Q-learning |

No-op Starts

| Result | Method | Type | Score from |
| --- | --- | --- | --- |
| 98763.2 | YouTube | Imitation | Playing hard exploration games by watching YouTube |
| 98212.5 | YouTube (imitation only) | Imitation | Playing hard exploration games by watching YouTube |
| 69571 | Human | Human | Human-level control through deep reinforcement learning |
| 42457.2 | DQfD | Imitation | Deep Q-Learning from Demonstrations |
| 15172.9 | DistributionalDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 15095.0 | C51 | Misc | A Distributional Perspective on Reinforcement Learning |
| 4234.0 | RainbowDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 3966.0 | NoisyNetDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 3781.0 | A3C | PG | Noisy Networks for Exploration |
| 3712.0 | NoisyNet-DQN | DQN | Noisy Networks for Exploration |
| 2361.0 | DQN | DQN | Noisy Networks for Exploration |
| 1788 | DQN2015 | DQN | Human-level control through deep reinforcement learning |
| 748.6 | GorilaDQN | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 684.3 | Linear | Misc | Human-level control through deep reinforcement learning |
| 670.1 | DDQN | DQN | Deep Reinforcement Learning with Double Q-learning |
| 286.7 | DDQN+PopArt | DQN | Learning values across many orders of magnitude |
| 279.0 | NoisyNet-DuelingDQN | DQN | Noisy Networks for Exploration |
| 227.0 | DuelingDQN | DQN | Noisy Networks for Exploration |
| 206.0 | DuelingPERDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 200.0 | PERDDQN (rank) | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 200.0 | PERDDQN (prop) | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 154.3 | DuelingPERDDQN | DQN | Deep Q-Learning from Demonstrations |
| 146.7 | DQN2015 | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 129.7 | DDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 103.0 | DuelingDDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 100.0 | NoisyNet-A3C | PG | Noisy Networks for Exploration |
| 86 | Contingency | Misc | Human-level control through deep reinforcement learning |
| 49.8 | ApeX DQN | DQN | Distributed Prioritized Experience Replay |
| 49.8 | ApeX | DQN | Playing hard exploration games by watching YouTube |
| 24.9 | Random | Random | Human-level control through deep reinforcement learning |
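The no-op starts protocol used for the table above comes from Human-level control through deep reinforcement learning: each evaluation episode begins with a random number (up to 30) of no-op actions so that agents are not always evaluated from a single deterministic start state. The sketch below illustrates the idea under the classic Gym API; the environment id `PrivateEyeNoFrameskip-v4` and the random action choice (a stand-in for a trained policy) are assumptions for illustration, not part of the reported results.

```python
import random

import gym  # classic Gym API (gym < 0.26) with the Atari environments installed


def run_noop_start_episode(env, max_noops=30):
    """Run one evaluation episode under the no-op starts protocol:
    begin with a random number (1..max_noops) of no-op actions,
    then let the policy (here a random one) act until the episode ends."""
    obs = env.reset()
    for _ in range(random.randint(1, max_noops)):
        obs, _, done, _ = env.step(0)  # action 0 is NOOP in the ALE action set
        if done:
            obs = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = env.action_space.sample()  # stand-in for a trained agent
        obs, reward, done, _ = env.step(action)
        total_reward += reward
    return total_reward


# Hypothetical environment id for the ALE game "private_eye".
env = gym.make("PrivateEyeNoFrameskip-v4")
print(run_noop_start_episode(env))
```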

Normal Starts

| Result | Method | Type | Score from |
| --- | --- | --- | --- |
| 182.0 | ACER | PG | Proximal Policy Optimization Algorithms |
| 91.3 | A2C | PG | Proximal Policy Optimization Algorithms |
| 69.5 | PPO | PG | Proximal Policy Optimization Algorithms |