Overview

The galaxy of Solaris is made up of 16 quadrants, each containing 48 sectors. The player uses a tactical map to choose a sector to warp to; during the warp, they must try to keep their ship “in focus” to lower its fuel consumption rate. Fuel must be carefully managed, as an empty tank costs the player a life. Space battle ensues whenever the player navigates into a hostile battlegroup via the tactical map. Space enemies include pirate ships, mechanoid ships, and aggressive “cobra” ships. Each battlegroup has at least one enemy flagship, which launches fuel-sapping drones.

The player may also descend to one of three types of planets:

  1. Friendly federation planets, which provide refueling but may also harbor a planet-defense mission if they are under attack. If the player allows a friendly planet in a quadrant to be destroyed, that quadrant becomes a “red zone” where the joystick controls are reversed and booming sounds are heard.
  2. Enemy Zylon planets, where the player must rescue all of the cadets; rescuing every cadet earns an extra ship.
  3. Enemy corridor planets, where the player must fly through a fast-paced corridor.

There are four kinds of ground enemies found on planets: stationary guardians, gliders, targeters, and raiders. The ultimate goal is to reach the planet Solaris and rescue its colonists, at which point the game ends in victory.

Description from Wikipedia

State of the Art

Human Starts

| Result  | Method             | Type   | Score from                                                       |
|---------|--------------------|--------|------------------------------------------------------------------|
| 11032.6 | Human              | Human  | Deep Reinforcement Learning with Double Q-learning                |
| 3115.9  | ApeX DQN           | DQN    | Distributed Prioritized Experience Replay                         |
| 2860.7  | RainbowDQN         | DQN    | Rainbow: Combining Improvements in Deep Reinforcement Learning    |
| 2608.2  | NoisyNetDQN        | DQN    | Rainbow: Combining Improvements in Deep Reinforcement Learning    |
| 2530.2  | DistributionalDQN  | DQN    | Rainbow: Combining Improvements in Deep Reinforcement Learning    |
| 2272.8  | PERDDQN (rank)     | DQN    | Prioritized Experience Replay                                     |
| 2238.2  | PERDDQN (prop)     | DQN    | Prioritized Experience Replay                                     |
| 2047.2  | Random             | Random | Deep Reinforcement Learning with Double Q-learning                |
| 1956.0  | A3C FF (4 days)    | PG     | Asynchronous Methods for Deep Reinforcement Learning              |
| 1936.4  | A3C LSTM           | PG     | Asynchronous Methods for Deep Reinforcement Learning              |
| 1884.8  | A3C FF (1 day)     | PG     | Asynchronous Methods for Deep Reinforcement Learning              |
| 1768.4  | DuelingDDQN        | DQN    | Dueling Network Architectures for Deep Reinforcement Learning     |
| 1295.4  | DQN2015            | DQN    | Dueling Network Architectures for Deep Reinforcement Learning     |
| 810.0   | DDQN               | DQN    | Deep Reinforcement Learning with Double Q-learning                |
| 280.6   | DuelingPERDQN      | DQN    | Dueling Network Architectures for Deep Reinforcement Learning     |
| 134.6   | PERDQN (rank)      | DQN    | Prioritized Experience Replay                                     |
| -9001.0 | DQN2015            | DQN    | Asynchronous Methods for Deep Reinforcement Learning              |
| -9001.0 | GorilaDQN          | DQN    | Asynchronous Methods for Deep Reinforcement Learning              |

No-op Starts

| Result  | Method              | Type      | Score from                                                                                      |
|---------|---------------------|-----------|-------------------------------------------------------------------------------------------------|
| 12380.0 | A3C                 | PG        | Noisy Networks for Exploration                                                                    |
| 12326.7 | Human               | Human     | Dueling Network Architectures for Deep Reinforcement Learning                                     |
| 10427.0 | NoisyNet-A3C        | PG        | Noisy Networks for Exploration                                                                    |
| 8342.0  | C51                 | Misc      | A Distributional Perspective on Reinforcement Learning                                            |
| 6522.0  | NoisyNet-DuelingDQN | DQN       | Noisy Networks for Exploration                                                                    |
| 6088.0  | NoisyNet-DQN        | DQN       | Noisy Networks for Exploration                                                                    |
| 5643.1  | DistributionalDQN   | DQN       | Rainbow: Combining Improvements in Deep Reinforcement Learning                                    |
| 4544.8  | DDQN+PopArt         | DQN       | Learning values across many orders of magnitude                                                   |
| 4309.0  | PER                 | DQN       | Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation |
| 4309.0  | PERDDQN (rank)      | DQN       | Dueling Network Architectures for Deep Reinforcement Learning                                     |
| 4157.0  | DuelingPERDDQN      | DQN       | Deep Q-Learning from Demonstrations                                                               |
| 4055.0  | DQN                 | DQN       | Noisy Networks for Exploration                                                                    |
| 3560.3  | RainbowDQN          | DQN       | Rainbow: Combining Improvements in Deep Reinforcement Learning                                    |
| 3482.8  | DQN2015             | DQN       | Dueling Network Architectures for Deep Reinforcement Learning                                     |
| 3423.0  | DuelingDQN          | DQN       | Noisy Networks for Exploration                                                                    |
| 3204.5  | NoisyNetDQN         | DQN       | Rainbow: Combining Improvements in Deep Reinforcement Learning                                    |
| 3067.8  | DDQN                | DQN       | Dueling Network Architectures for Deep Reinforcement Learning                                     |
| 2892.9  | ApeX DQN            | DQN       | Distributed Prioritized Experience Replay                                                         |
| 2616.8  | DQfD                | Imitation | Deep Q-Learning from Demonstrations                                                               |
| 2368.6  | ACKTR               | PG        | Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation |
| 2250.8  | DuelingDDQN         | DQN       | Dueling Network Architectures for Deep Reinforcement Learning                                     |
| 1710.8  | PERDDQN (prop)      | DQN       | Rainbow: Combining Improvements in Deep Reinforcement Learning                                    |
| 1263.0  | Random              | Random    | Noisy Networks for Exploration                                                                    |
| 1236.3  | Random              | Random    | Dueling Network Architectures for Deep Reinforcement Learning                                     |
| 133.4   | DuelingPERDQN       | DQN       | Dueling Network Architectures for Deep Reinforcement Learning                                     |
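
For readers comparing against these numbers, the no-op starts protocol simply prefixes each evaluation episode with a random number of no-op actions (up to 30) before the agent takes control, so that episodes do not all begin from the same deterministic state. Below is a minimal sketch of that protocol, assuming the classic Gym Atari API (4-tuple `step` returns) and that the environment id `SolarisNoFrameskip-v4` is registered; the random policy is only a stand-in for a trained agent.

```python
# Sketch of no-op starts evaluation on Solaris.
# Assumes gym with the Atari environments installed and the old 4-tuple step API.
import random
import gym


def evaluate_noop_starts(env_id="SolarisNoFrameskip-v4", episodes=5, max_noops=30):
    env = gym.make(env_id)
    scores = []
    for _ in range(episodes):
        obs = env.reset()
        done = False
        total = 0.0
        # Up to `max_noops` no-op actions (action 0 in ALE) randomize the start state.
        for _ in range(random.randint(1, max_noops)):
            obs, reward, done, info = env.step(0)
            total += reward
            if done:
                obs = env.reset()
        # Hand control to the policy; a random policy stands in for a trained agent here.
        while not done:
            action = env.action_space.sample()
            obs, reward, done, info = env.step(action)
            total += reward
        scores.append(total)
    env.close()
    return sum(scores) / len(scores)


if __name__ == "__main__":
    print(evaluate_noop_starts(episodes=1))
```

The human starts rows instead begin episodes from states sampled from human play, so they are not reproducible from the environment alone; scores under the two protocols are not directly comparable.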