# Atari H.E.R.O. Environment

## Overview

The player assumes control of Roderick Hero (sometimes styled as “R. Hero”), a one-man rescue team. Miners working in Mount Leone are trapped, and it’s up to Roderick to reach them.

The player is equipped with a backpack-mounted helicopter unit, which allows him to hover and fly, along with a helmet-mounted laser and a limited supply of dynamite. Each level consists of a maze of mine shafts that Roderick must safely navigate in order to reach the miner trapped at the bottom. The backpack has a limited amount of power, so the player must reach the miner before the power supply is exhausted.

Mine shafts may be blocked by cave-ins or magma, which require dynamite to clear. The helmet laser can also destroy cave-ins, but more slowly than dynamite. Unlike a cave-in, magma is lethal when touched. Later levels include walls of magma with openings that alternate between open and closed, requiring skillful navigation. The mine shafts are populated by spiders, bats, and other unknown creatures that are deadly to the touch; these creatures can be destroyed using the laser or dynamite.

Some deep mines are flooded, forcing players to hover safely above the water. In later levels, monsters strike out from below the water. Some mine sections are illuminated by lanterns. If the lantern is somehow destroyed, the layout of that section becomes invisible. Exploding dynamite lights up the mine for a brief time.

Points are scored for each cave-in cleared and each creature destroyed. When the player reaches the miner, points are awarded for the rescue, for the power remaining in the backpack, and for each remaining stick of dynamite. Extra lives are awarded for every 20,000 points scored.

Description from Wikipedia
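
As a quick way to experiment with the game itself, H.E.R.O. can be loaded through the Arcade Learning Environment. Below is a minimal sketch, assuming Gymnasium with the `ale-py` plugin and the `ALE/Hero-v5` environment ID (the exact ID and registration step may differ across versions):

```python
import gymnasium as gym
import ale_py  # provides the Atari (ALE) environments

gym.register_envs(ale_py)  # explicit registration; needed on recent Gymnasium versions

env = gym.make("ALE/Hero-v5")
obs, info = env.reset(seed=0)

episode_return = 0.0
terminated = truncated = False
while not (terminated or truncated):
    action = env.action_space.sample()  # random policy, purely for illustration
    obs, reward, terminated, truncated, info = env.step(action)
    episode_return += reward

print(f"Random-policy episode return: {episode_return}")
env.close()
```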

## Performances of RL Agents

We list the scores of various reinforcement learning algorithms tested in this environment. These results are taken from RL Database. If this page was helpful, please consider giving it a star!

### Human Starts

| Result | Algorithm | Source |
|--------|-----------|--------|
| 50496.8 | Rainbow | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 32464.1 | A3C FF | Asynchronous Methods for Deep Reinforcement Learning |
| 28889.5 | A3C LSTM | Asynchronous Methods for Deep Reinforcement Learning |
| 28765.8 | A3C FF 1 day | Asynchronous Methods for Deep Reinforcement Learning |
| 28544.2 | Distributional DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 25839.4 | Human | Massively Parallel Methods for Deep Reinforcement Learning |
| 20889.9 | Prioritized DDQN (rank, tuned) | Prioritized Experience Replay |
| 20506.4 | Prioritized DDQN (prop, tuned) | Prioritized Experience Replay |
| 15459.2 | PDD DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 15341.4 | DDQN | Deep Reinforcement Learning with Double Q-learning |
| 15207.9 | DuDQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 15150.9 | Prioritized DQN (rank) | Prioritized Experience Replay |
| 14892.5 | DDQN (tuned) | Deep Reinforcement Learning with Double Q-learning |
| 12952.5 | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 8963.36 | Gorila DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 1580.3 | Random | Massively Parallel Methods for Deep Reinforcement Learning |

### No-op Starts

| Result | Algorithm | Source |
|--------|-----------|--------|
| 55887.4 | Rainbow | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 43360.4 | Reactor | The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning |
| 38874 | C51 | A Distributional Perspective on Reinforcement Learning |
| 35895 | DuDQN | Noisy Networks for Exploration |
| 35542.2 | Reactor | The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning |
| 33860.9 | Distributional DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 33853.15 | IMPALA (shallow) | IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures |
| 33730.55 | IMPALA (deep) | IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures |
| 31533 | NoisyNet DuDQN | Noisy Networks for Exploration |
| 30826.4 | Human | Dueling Network Architectures for Deep Reinforcement Learning |
| 30791 | A3C | Noisy Networks for Exploration |
| 28386 | IQN | Implicit Quantile Networks for Distributional Reinforcement Learning |
| 27833.0 | Reactor ND | The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning |
| 25762.5 | Human | Human-level control through deep reinforcement learning |
| 21785 | QR-DQN-0 | Distributional Reinforcement Learning with Quantile Regression |
| 21395 | QR-DQN-1 | Distributional Reinforcement Learning with Quantile Regression |
| 21036.5 | PDD DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 20818.2 | DuDQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 20437.8 | DQN | A Distributional Perspective on Reinforcement Learning |
| 20357.0 | DDQN | Deep Reinforcement Learning with Double Q-learning |
| 20130.2 | DDQN | A Distributional Perspective on Reinforcement Learning |
| 19950 | DQN | Human-level control through deep reinforcement learning |
| 18818.9 | IMPALA (deep, multitask) | IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures |
| 15176 | DQN | Noisy Networks for Exploration |
| 14913.87 | Gorila DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 8471 | NoisyNet A3C | Noisy Networks for Exploration |
| 7295 | Contingency | Human-level control through deep reinforcement learning |
| 6459 | Linear | Human-level control through deep reinforcement learning |
| 6246 | NoisyNet DQN | Noisy Networks for Exploration |
| 1027.0 | Random | Human-level control through deep reinforcement learning |
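
For reference, "No-op Starts" scores conventionally follow the Atari evaluation protocol in which each test episode begins with a random number of no-op actions (up to 30) before the agent takes control, while "Human Starts" episodes are initialized from states sampled from human play. Below is a rough sketch of the no-op protocol, reusing the Gymnasium setup from the earlier example and a hypothetical `agent.act(obs)` policy interface:

```python
import random
import gymnasium as gym
import ale_py

gym.register_envs(ale_py)

def evaluate_noop_starts(env_id, agent, episodes=30, max_noops=30):
    """Average return over episodes that each start with 1..max_noops no-op actions."""
    env = gym.make(env_id)
    noop_action = 0  # in the ALE action set, action 0 is NOOP
    returns = []
    for _ in range(episodes):
        obs, info = env.reset()
        terminated = truncated = False
        episode_return = 0.0
        # Random no-op prefix so the agent does not always see the same start state.
        for _ in range(random.randint(1, max_noops)):
            obs, reward, terminated, truncated, info = env.step(noop_action)
            episode_return += reward
            if terminated or truncated:
                obs, info = env.reset()
                terminated = truncated = False
        # Let the agent play out the rest of the episode.
        while not (terminated or truncated):
            action = agent.act(obs)  # hypothetical policy interface
            obs, reward, terminated, truncated, info = env.step(action)
            episode_return += reward
        returns.append(episode_return)
    env.close()
    return sum(returns) / len(returns)
```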

### Normal Starts

| Result | Algorithm | Source |
|--------|-----------|--------|