Overview

The player controls Thomas with a four-way joystick and two attack buttons for punching and kicking. Unlike more conventional side-scrolling games, the joystick is used not only to crouch but also to jump. Punches and kicks can be performed from a standing, crouching, or jumping position. Punches award more points and do more damage than kicks, but they have a shorter range.

Underlings encountered by the player include Grippers, who can grab Thomas and drain his energy until shaken off; Knife Throwers, who can throw knives at two different heights and must be hit twice; and Tom Toms, short fighters who can either grab Thomas or somersault to strike his head when he is crouching. On even-numbered floors, the player must also deal with falling balls and pots, snakes, poisonous moths, fire-breathing dragons, and exploding confetti balls.

The temple has five floors, each ending with a fight against a different boss; these “sons of the devil” are the Stick Fighter on the first floor, the Boomerang Fighter on the second, the Strongman on the third, the Black Magician on the fourth, and Mr. X on the fifth. Each boss must be defeated before Thomas can climb the stairs to the next floor and, ultimately, rescue Silvia. Thomas must complete each floor within a fixed time; if time runs out or his energy is completely drained, he loses one life and must replay the entire floor. If a boss defeats Thomas, the boss laughs; although there are five bosses, the game uses only two different synthesized laughs. (The NES version uses a third, high-pitched synthesized laugh for the Black Magician, the fourth boss.)

Once the player has completed all five floors, the game restarts with a more demanding version of the Devil’s Temple, although the essential details remain unchanged. A visual indication of the current playthrough is displayed on screen: for each series of five completed floors, a dragon symbol appears in the upper-right corner, and after three dragons have been added, the dragon symbols blink.

Description from Wikipedia
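
For reference when reproducing the results below, the game is available as an Atari environment in the Arcade Learning Environment. Below is a minimal sketch, assuming Gymnasium with the ale-py Atari extras installed and the environment id ALE/KungFuMaster-v5 (the exact id depends on installed versions). It prints the discrete action set behind the joystick-and-two-buttons controls described above and runs a random policy, comparable to the “Random” rows in the tables below.

```python
# Minimal sketch (assumed environment id: "ALE/KungFuMaster-v5").
import gymnasium as gym
# import ale_py; gym.register_envs(ale_py)   # may be needed on recent ale-py/Gymnasium versions

env = gym.make("ALE/KungFuMaster-v5")
print(env.action_space)                       # Discrete(n) over joystick/button combinations
print(env.unwrapped.get_action_meanings())    # e.g. NOOP, UP, DOWN, FIRE, and combinations

obs, info = env.reset(seed=0)
terminated = truncated = False
episode_return = 0.0
while not (terminated or truncated):
    action = env.action_space.sample()        # uniformly random action each frame
    obs, reward, terminated, truncated, info = env.step(action)
    episode_return += reward
print("random-policy episode return:", episode_return)
env.close()
```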

State of the Art

Human Starts

| Result | Method | Type | Score from |
|--------|--------|------|------------|
| 72068.0 | ApeX DQN | DQN | Distributed Prioritized Experience Replay |
| 40835.0 | A3C LSTM | PG | Asynchronous Methods for Deep Reinforcement Learning |
| 37484.0 | DuelingPERDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 33890.0 | DistributionalDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 31676.0 | PERDDQN (rank) | DQN | Prioritized Experience Replay |
| 31244.0 | PERDDQN (prop) | DQN | Prioritized Experience Replay |
| 30207.0 | DDQN | DQN | Deep Reinforcement Learning with Double Q-learning |
| 28999.8 | RainbowDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 28819.0 | A3C FF (4 days) | PG | Asynchronous Methods for Deep Reinforcement Learning |
| 27921.0 | NoisyNetDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 24288.0 | DuelingDDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 20882.0 | DQN2015 | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 20786.8 | Human | Human | Massively Parallel Methods for Deep Reinforcement Learning |
| 20620.0 | GorilaDQN | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 20181.0 | PERDQN (rank) | DQN | Prioritized Experience Replay |
| 11875.0 | DQN2015 | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 3046.0 | A3C FF (1 day) | PG | Asynchronous Methods for Deep Reinforcement Learning |
| 304.0 | Random | Random | Massively Parallel Methods for Deep Reinforcement Learning |
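
In the human-starts regime (introduced in Massively Parallel Methods for Deep Reinforcement Learning), evaluation episodes begin from states sampled along a human player's trajectory rather than from the power-on state, which penalizes agents that only memorize one deterministic path through the temple. Below is a minimal sketch of the mechanism using ale-py's state cloning; the random policy standing in for the recorded human player and the snapshot spacing are illustrative assumptions, not the papers' actual data.

```python
# Sketch of "human starts": clone emulator states at points along a play-through
# (human play in the papers; a random policy stands in here), then restore a
# randomly chosen state as the starting point of an evaluation episode.
import random
import gymnasium as gym

env = gym.make("ALE/KungFuMaster-v5")
env.reset(seed=0)

start_states = []
for t in range(2000):
    _, _, terminated, truncated, _ = env.step(env.action_space.sample())
    if terminated or truncated:
        env.reset()
    elif t % 100 == 0:
        start_states.append(env.unwrapped.ale.cloneState())   # emulator snapshot

# Evaluation episode: resume from a sampled start state and hand control to the agent.
env.unwrapped.ale.restoreState(random.choice(start_states))
```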

No-op Starts

| Result | Method | Type | Score from |
|--------|--------|------|------------|
| 97829.5 | ApeX DQN | DQN | Distributed Prioritized Experience Replay |
| 55790.0 | NoisyNet-A3C | PG | Noisy Networks for Exploration |
| 52181.0 | RainbowDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 48375.0 | DuelingPERDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 48192.0 | C51 | Misc | A Distributional Perspective on Reinforcement Learning |
| 43470.0 | PERDDQN (prop) | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 43009.0 | DistributionalDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 41672.0 | NoisyNet-DuelingDQN | DQN | Noisy Networks for Exploration |
| 39581.0 | PER | DQN | Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation |
| 39581.0 | PERDDQN (rank) | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 37422.0 | A3C | PG | Noisy Networks for Exploration |
| 36310.0 | NoisyNet-DQN | DQN | Noisy Networks for Exploration |
| 34954.0 | ACKTR | PG | Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation |
| 34393.0 | DDQN+PopArt | DQN | Learning values across many orders of magnitude |
| 34294.0 | DuelingDDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 34099.0 | NoisyNetDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 32212.3 | DuelingPERDDQN | DQN | Deep Q-Learning from Demonstrations |
| 30444.0 | DQN | DQN | Noisy Networks for Exploration |
| 30316.0 | DuelingDQN | DQN | Noisy Networks for Exploration |
| 29710.0 | DDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 29486.0 | DDQN | DQN | Deep Reinforcement Learning with Double Q-learning |
| 29151 | Contingency | Misc | Human-level control through deep reinforcement learning |
| 29132.0 | DQfD | Imitation | Deep Q-Learning from Demonstrations |
| 27543.33 | GorilaDQN | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 26059.0 | DQN2015 | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 23270 | DQN2015 | DQN | Human-level control through deep reinforcement learning |
| 22736 | Human | Human | Human-level control through deep reinforcement learning |
| 19544 | Linear | Misc | Human-level control through deep reinforcement learning |
| 258.5 | Random | Random | Human-level control through deep reinforcement learning |
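
The no-op starts rows follow the evaluation protocol of Human-level control through deep reinforcement learning: each evaluation episode begins with a random number of up to 30 no-op actions before the agent takes control, so scores do not reflect a single memorized start state. Below is a minimal sketch of such a wrapper in Gymnasium, assuming action 0 is NOOP (as in ALE's action set); the class name is illustrative.

```python
# Sketch of the "no-op starts" protocol: on reset, apply 1 to noop_max NOOP
# actions before returning control to the agent. Wrapper/class name is
# illustrative; action 0 is assumed to be NOOP as in ALE's action set.
import random
import gymnasium as gym

class NoopStarts(gym.Wrapper):
    def __init__(self, env, noop_max=30, noop_action=0):
        super().__init__(env)
        self.noop_max = noop_max
        self.noop_action = noop_action

    def reset(self, **kwargs):
        obs, info = self.env.reset(**kwargs)
        for _ in range(random.randint(1, self.noop_max)):
            obs, _, terminated, truncated, info = self.env.step(self.noop_action)
            if terminated or truncated:          # rare, but start over if the game ends
                obs, info = self.env.reset(**kwargs)
        return obs, info

# Usage: env = NoopStarts(gym.make("ALE/KungFuMaster-v5"))
```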

Normal Starts

| Result | Method | Type | Score from |
|--------|--------|------|------------|
| 27599.3 | ACER | PG | Proximal Policy Optimization Algorithms |
| 24900.3 | A2C | PG | Proximal Policy Optimization Algorithms |
| 23310.3 | PPO | PG | Proximal Policy Optimization Algorithms |