Atari Double Dunk Environment

Overview

Double Dunk is a simulation of two-on-two, half-court basketball. Each team has two on-screen characters: a shorter “outside” man and a taller “inside” man. In a single-player game, the player controls the character closest to the ball, either the one holding the ball (on offense) or the one guarding the ball handler (on defense). In a two-player game, each player may control one of the two teams as in a one-player game, or both players may play on the same team against a computer-controlled opponent. At the start of each possession, both offense and defense select from a number of set plays (such as the “pick and roll” on offense), then attempt to score or to regain possession of the ball by intercepting a pass or stealing it from the offense.

Description from Wikipedia
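
For readers who want to try the game themselves, here is a minimal sketch of loading Double Dunk through Gymnasium and the ale-py plugin. The package names and the `ALE/DoubleDunk-v5` ID are assumptions based on current ALE releases, not part of the original benchmark setups.

```python
import gymnasium as gym
import ale_py

# Gymnasium 1.0+ requires explicit registration of the ALE environments;
# on older versions, importing ale_py is sufficient.
gym.register_envs(ale_py)

env = gym.make("ALE/DoubleDunk-v5")
obs, info = env.reset(seed=0)
print(env.action_space)  # Double Dunk uses the full 18-action Atari set
obs, reward, terminated, truncated, info = env.step(0)  # action 0 is NOOP
env.close()
```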

Performance of RL Agents

We list the results of various reinforcement learning algorithms tested in this environment. These results are taken from RL Database. If this page was helpful, please consider giving it a star!


Human Starts

Under the human-starts protocol, each evaluation episode begins from a state sampled from human play, so agents cannot exploit the emulator's determinism by memorizing a single start state.

| Result | Algorithm | Source |
| --- | --- | --- |
| 16.0 | Prioritized DDQN (rank, tuned) | Prioritized Experience Replay |
| 2.7 | Prioritized DDQN (prop, tuned) | Prioritized Experience Replay |
| 0.1 | A3C FF 1 day | Asynchronous Methods for Deep Reinforcement Learning |
| 0.1 | A3C LSTM | Asynchronous Methods for Deep Reinforcement Learning |
| -0.1 | A3C FF | Asynchronous Methods for Deep Reinforcement Learning |
| -0.3 | DDQN (tuned) | Deep Reinforcement Learning with Double Q-learning |
| -0.6 | Rainbow | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| -0.8 | DuDQN | Dueling Network Architectures for Deep Reinforcement Learning |
| -3.7 | Distributional DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| -5.3 | Prioritized DQN (rank) | Prioritized Experience Replay |
| -6.4 | DDQN | Deep Reinforcement Learning with Double Q-learning |
| -10.7 | PDD DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| -11.35 | Gorila DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| -14.4 | Human | Massively Parallel Methods for Deep Reinforcement Learning |
| -16.0 | Random | Massively Parallel Methods for Deep Reinforcement Learning |
| -21.6 | DQN | Massively Parallel Methods for Deep Reinforcement Learning |

No-op Starts

Under the no-op protocol, each evaluation episode begins with a random number of no-op actions (up to 30) to randomize the start state; a minimal sketch of this protocol follows the table.

| Result | Algorithm | Source |
| --- | --- | --- |
| 23.0 | Reactor ND | The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning |
| 23.0 | Reactor | The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning |
| 21.9 | QR-DQN-1 | Distributional Reinforcement Learning with Quantile Regression |
| 17 | DuDQN | Noisy Networks for Exploration |
| 12.3 | QR-DQN-0 | Distributional Reinforcement Learning with Quantile Regression |
| 11.4 | Reactor | The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning |
| 5.6 | IQN | Implicit Quantile Networks for Distributional Reinforcement Learning |
| 3 | A3C | Noisy Networks for Exploration |
| 3 | NoisyNet A3C | Noisy Networks for Exploration |
| 2.5 | C51 | A Distributional Perspective on Reinforcement Learning |
| 1 | NoisyNet DQN | Noisy Networks for Exploration |
| 1 | NoisyNet DuDQN | Noisy Networks for Exploration |
| 0.1 | DuDQN | Dueling Network Architectures for Deep Reinforcement Learning |
| -0.3 | Rainbow | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| -0.33 | IMPALA (deep) | IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures |
| -0.35 | IMPALA (shallow) | IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures |
| -0.54 | ACKTR | Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation |
| -1.92 | IMPALA (deep, multitask) | IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures |
| -3.8 | Distributional DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| -5.5 | DDQN | A Distributional Perspective on Reinforcement Learning |
| -6 | DQN | Noisy Networks for Exploration |
| -6.3 | DDQN | Deep Reinforcement Learning with Double Q-learning |
| -6.6 | DQN | A Distributional Perspective on Reinforcement Learning |
| -10.62 | Gorila DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| -12.5 | PDD DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| -13.1 | Linear | Human-level control through deep reinforcement learning |
| -15.5 | Human | Human-level control through deep reinforcement learning |
| -16 | Contingency | Human-level control through deep reinforcement learning |
| -16.4 | Human | Dueling Network Architectures for Deep Reinforcement Learning |
| -18.1 | DQN | Human-level control through deep reinforcement learning |
| -18.6 | Random | Human-level control through deep reinforcement learning |
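
The no-op starts condition is easy to reproduce as a Gymnasium wrapper. The sketch below is modeled on the NoopResetEnv wrapper from OpenAI Baselines; this minimal version is our own, and it assumes action 0 is NOOP, which holds for ALE environments.

```python
import gymnasium as gym


class NoopResetEnv(gym.Wrapper):
    """Apply a random number of no-op actions after reset (no-op starts)."""

    def __init__(self, env, noop_max=30):
        super().__init__(env)
        self.noop_max = noop_max

    def reset(self, **kwargs):
        obs, info = self.env.reset(**kwargs)
        noops = int(self.np_random.integers(1, self.noop_max + 1))
        for _ in range(noops):
            obs, _, terminated, truncated, info = self.env.step(0)
            if terminated or truncated:  # rare, but restart if the episode ends
                obs, info = self.env.reset(**kwargs)
        return obs, info
```

Wrapping the environment in this class before evaluation approximates the start-state randomization used by the DQN line of papers, though published pipelines differ in further details (frame skipping, time limits, and so on).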

Normal Starts

These results come from papers that report scores without either of the randomized-start protocols above.

| Result | Algorithm | Source |
| --- | --- | --- |
| -2 | DRQN | Deep Recurrent Q-Learning for Partially Observable MDPs |
| -10 | DQN Ours | Deep Recurrent Q-Learning for Partially Observable MDPs |
| -13.2 | ACER | Proximal Policy Optimization Algorithms |
| -14 | DRQN | Deep Recurrent Q-Learning for Partially Observable MDPs |
| -14.9 | PPO | Proximal Policy Optimization Algorithms |
| -16.2 | DQN Ours | Deep Recurrent Q-Learning for Partially Observable MDPs |
| -16.2 | A2C | Proximal Policy Optimization Algorithms |
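
All of the numbers above are mean undiscounted per-episode scores. As a rough illustration, here is a minimal evaluation loop for a random policy; the episode count, seeding, and the omission of protocol details (no-op starts, time limits) are simplifications, so it will not exactly reproduce any published figure.

```python
import gymnasium as gym
import ale_py

gym.register_envs(ale_py)  # needed on Gymnasium 1.0+


def evaluate(env, episodes=30, seed=0):
    """Mean undiscounted return of a random policy over `episodes` episodes."""
    returns = []
    for ep in range(episodes):
        obs, info = env.reset(seed=seed + ep)
        total, done = 0.0, False
        while not done:
            action = env.action_space.sample()  # stand-in for a trained agent
            obs, reward, terminated, truncated, info = env.step(action)
            total += reward
            done = terminated or truncated
        returns.append(total)
    return sum(returns) / len(returns)


env = gym.make("ALE/DoubleDunk-v5")
print(f"mean return: {evaluate(env):.2f}")  # should land near the Random rows above
env.close()
```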