Overview

The players’ characters, called Worriors, must kill all the monsters by shooting them. Player one has yellow Worriors, on the right, and player two has blue Worriors, on the left. In a two-player game, the players are also able to shoot each other’s Worriors, earning bonus points and causing the other player to lose a life. Team-oriented players can successfully advance through the game by standing back-to-back (such as in a corner) and firing at anything that comes at them.

Each dungeon is a single-screen rectangular grid of walls and corridors in various formations, through which the Worriors and the monsters can travel freely. Each dungeon has doors at its left and right edges that connect with each other, making the dungeon wrap around. Whenever a player or monster passes through a door, both doors deactivate for a short period and become impassable; the exception is when the Worluk or the Wizard of Wor is in the dungeon, in which case a player who exits through a door can pop back through immediately. A small radar display indicates the positions of all active monsters.

As long as a player has at least one life in reserve, a backup Worrior is displayed in a small sealed cubbyhole at the corresponding bottom corner of the dungeon. When the current Worrior is killed, the cubbyhole opens and the player has ten seconds to move the backup into play before it is forced in automatically.

The various monsters include the following:

  • Burwor: A blue wolf-type creature.
  • Garwor: A yellow Tyrannosaurus rex-type creature.
  • Thorwor: A red scorpion-like creature.
  • Worluk: An Insectoid-type creature.
  • Wizard of Wor: A blue wizard.

Both Garwors and Thorwors can turn invisible at times, but they always appear on the radar. All enemies except the Worluk can shoot at the Worriors.

Each dungeon starts filled with six Burwors. In the first dungeon, killing the last Burwor will make a Garwor appear; in the second, the last two Burwors are replaced by Garwors when killed; and so on. From the sixth dungeon on, a Garwor will replace every Burwor when killed. On every screen, killing a Garwor causes a Thorwor to appear. There will never be more than six enemies on the screen at once. From the second dungeon on, after the last Thorwor is killed, a Worluk will appear and try to escape through one of the side doors. The level ends when the Worluk either escapes or is killed; in the latter case, all point values for the next dungeon are doubled.
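The replacement rules above can be sketched as a small function. This is an illustrative model of the description, not the game's actual code; the function and parameter names are my own.

```python
from typing import Optional

def garwors_from_burwors(dungeon: int) -> int:
    """Number of final Burwors in `dungeon` that respawn as Garwors:
    1 in dungeon 1, 2 in dungeon 2, ..., all 6 from dungeon 6 on."""
    return min(dungeon, 6)

def spawn_on_kill(dungeon: int, killed: str, burwors_left: int) -> Optional[str]:
    """Monster (if any) that appears when `killed` dies.

    `burwors_left` counts the Burwors remaining after the kill."""
    if killed == "Burwor" and burwors_left < garwors_from_burwors(dungeon):
        return "Garwor"      # the dungeon's last few Burwors return as Garwors
    if killed == "Garwor":
        return "Thorwor"     # every Garwor killed yields a Thorwor
    return None              # Thorwors spawn nothing further
```

So in dungeon 1 only the last Burwor becomes a Garwor, while in dungeon 3 the last three do, matching the progression described above.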

From the second dungeon onward, the Wizard of Wor himself appears once the Worluk has either escaped or been killed. After a few seconds the Wizard disappears and teleports across the dungeon, gradually closing in on a Worrior. He remains in the dungeon until he shoots a Worrior or is killed, and he taunts the players throughout the game via a speech synthesizer.

Players are referred to as “Worriors” during the first seven levels, then as “Worlords” beyond that point. The “Worlord Dungeons” are more difficult than the earlier levels because they have fewer interior walls.

There are two special dungeons with increased difficulty. Level 4 is “The Arena,” with a large open area in its center, and Level 13 is “The Pit,” with no interior walls at all. A bonus Worrior is awarded before each of these levels. Every sixth dungeon after Level 13 is another Pit. A player who survives any Pit level without losing a life earns the title of “Worlord Supreme.”
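The special-level schedule above reduces to a simple check. These are hypothetical helpers for illustration, not game code.

```python
def is_arena(level: int) -> bool:
    """Level 4 is The Arena."""
    return level == 4

def is_pit(level: int) -> bool:
    """Level 13 is The Pit, as is every sixth dungeon after it (19, 25, 31, ...)."""
    return level >= 13 and (level - 13) % 6 == 0
```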

Description from Wikipedia

State of the Art

Human Starts

| Result | Method | Type | Score from |
|-------:|--------|------|------------|
| 46897.0 | ApeX DQN | DQN | Distributed Prioritized Experience Replay |
| 18082.0 | A3C LSTM | PG | Asynchronous Methods for Deep Reinforcement Learning |
| 17244.0 | A3C FF (4 days) | PG | Asynchronous Methods for Deep Reinforcement Learning |
| 14631.5 | RainbowDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 11824.5 | DistributionalDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 10471.0 | DuelingPERDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 10431.0 | GorilaDQN | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 7451.0 | PERDDQN (prop) | DQN | Prioritized Experience Replay |
| 7054.0 | DuelingDDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 6201.0 | DDQN | DQN | Deep Reinforcement Learning with Double Q-learning |
| 5727.0 | PERDDQN (rank) | DQN | Prioritized Experience Replay |
| 5278.0 | A3C FF (1 day) | PG | Asynchronous Methods for Deep Reinforcement Learning |
| 4796.5 | NoisyNetDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 4556.0 | Human | Human | Massively Parallel Methods for Deep Reinforcement Learning |
| 2755.0 | PERDQN (rank) | DQN | Prioritized Experience Replay |
| 1609.0 | DQN2015 | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 804.0 | Random | Random | Massively Parallel Methods for Deep Reinforcement Learning |
| 246.0 | DQN2015 | DQN | Massively Parallel Methods for Deep Reinforcement Learning |

No-op Starts

| Result | Method | Type | Score from |
|-------:|--------|------|------------|
| 46204.0 | ApeX DQN | DQN | Distributed Prioritized Experience Replay |
| 17862.5 | RainbowDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 15994.5 | DistributionalDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 13731.33 | GorilaDQN | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 12723.0 | NoisyNet-A3C | PG | Noisy Networks for Exploration |
| 12352.0 | DuelingPERDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 10373.0 | PERDDQN (prop) | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 9300.0 | C51 | Misc | A Distributional Perspective on Reinforcement Learning |
| 9198.0 | NoisyNet-DQN | DQN | Noisy Networks for Exploration |
| 9149.0 | NoisyNet-DuelingDQN | DQN | Noisy Networks for Exploration |
| 8953.0 | A3C | PG | Noisy Networks for Exploration |
| 7855.0 | DuelingDDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 7492.0 | DDQN | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 6534.0 | DuelingDQN | DQN | Noisy Networks for Exploration |
| 5432.0 | NoisyNetDQN | DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 5204.0 | DDQN | DQN | Deep Reinforcement Learning with Double Q-learning |
| 4802.0 | PER | DQN | Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation |
| 4802.0 | PERDDQN (rank) | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 4757.0 | Human | Human | Human-level control through deep reinforcement learning |
| 4756.5 | Human | Human | Learning values across many orders of magnitude |
| 3601.0 | DQN | DQN | Noisy Networks for Exploration |
| 3393.0 | DQN2015 | DQN | Human-level control through deep reinforcement learning |
| 2704.0 | DQN2015 | DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 1981.0 | Linear | Misc | Human-level control through deep reinforcement learning |
| 702.0 | ACKTR | PG | Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation |
| 563.5 | Random | Random | Human-level control through deep reinforcement learning |
| 483.0 | DDQN+PopArt | DQN | Learning values across many orders of magnitude |
| 36.9 | Contingency | Misc | Human-level control through deep reinforcement learning |
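The headings above name evaluation protocols from the DQN literature: under "no-op starts" each evaluation episode begins with a random number of no-op actions (up to 30) so agents cannot exploit a single deterministic start state, while "human starts" episodes begin from states sampled from human play. A minimal sketch of the no-op protocol follows, using a toy stand-in environment rather than the real Arcade Learning Environment; `ToyEnv` and `evaluate_with_noop_starts` are illustrative names, not from any library.

```python
import random

NOOP_ACTION = 0   # by ALE convention, action index 0 is NOOP
MAX_NOOPS = 30    # cap commonly used in the DQN literature

class ToyEnv:
    """Stand-in for an Atari env: reward 1.0 per step, episode ends
    after 5 steps. step() returns (obs, reward, done)."""
    def reset(self):
        self.t = 0
        return 0
    def step(self, action):
        self.t += 1
        return 0, 1.0, self.t >= 5

def evaluate_with_noop_starts(env, policy, episodes, rng):
    """Mean episode return when each episode starts with 1..MAX_NOOPS no-ops."""
    totals = []
    for _ in range(episodes):
        obs = env.reset()
        for _ in range(rng.randint(1, MAX_NOOPS)):
            obs, _, done = env.step(NOOP_ACTION)
            if done:              # episode ended during the no-ops: restart
                obs = env.reset()
        total, done = 0.0, False
        while not done:
            obs, reward, done = env.step(policy(obs))
            total += reward
        totals.append(total)
    return sum(totals) / len(totals)

mean_return = evaluate_with_noop_starts(
    ToyEnv(), policy=lambda obs: 1, episodes=5, rng=random.Random(0))
```

"Normal starts" below simply means episodes begin from the environment's default reset state with no randomization of this kind.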

Normal Starts

| Result | Method | Type | Score from |
|-------:|--------|------|------------|
| 4185.3 | PPO | PG | Proximal Policy Optimization Algorithms |
| 2308.3 | ACER | PG | Proximal Policy Optimization Algorithms |
| 859.0 | A2C | PG | Proximal Policy Optimization Algorithms |