RL Weekly is a weekly newsletter highlighting important progress in reinforcement learning in research or industry.
RL Weekly 12: Atari Demos with Human Gaze Labels, New SOTA in Meta-RL, and a Hierarchical Take on Intrinsic Rewards
This week, we look at a new demonstration dataset of Atari games that includes trajectories and human gaze data. We also look at PEARL, a new meta-RL method that boasts sample efficiency and performance superior to previous state-of-the-art algorithms. Finally, we look at a novel method of incorporating intrinsic rewards.
RL Weekly 11: The Bitter Lesson by Richard Sutton, the Promise of Hierarchical RL, and Exploration with Human Feedback
In this issue, we first look at a diary entry by Richard S. Sutton (DeepMind, UAlberta) on Compute versus Clever. Then, we look at a post summarizing Hierarchical RL by Yannis Flet-Berliac (INRIA SequeL). Finally, we summarize a paper incorporating human feedback for exploration from Delft University of Technology.
RL Weekly 10: Learning from Playing, Understanding Multi-agent Intelligence, and Navigating in Google Street View
In this issue, we look at Google Brain's algorithm for learning from play, DeepMind's thoughts on multi-agent intelligence, and DeepMind's new navigation environment using Google Street View data.
RL Weekly 9: Sample-efficient Near-SOTA Model-based RL, Neural MMO, and Bottlenecks in Deep Q-Learning
In this issue, we look at SimPLe, a model-based RL algorithm that achieves near-state-of-the-art results on the Arcade Learning Environment (ALE). We also look at Neural MMO, a new multi-agent environment by OpenAI, and an empirical analysis by BAIR of possible sources of error in deep Q-learning.
RL Weekly 8: World Discovery Models, MuJoCo Soccer Environment, and Deep Planning Network
In this issue, we introduce World Discovery Models and MuJoCo Soccer Environment from Google DeepMind, and PlaNet from Google.
RL Weekly 7: Obstacle Tower Challenge, Hanabi Learning Environment, and Spinning Up Workshop
This week, we introduce the Obstacle Tower Challenge, a new RL competition by Unity; the Hanabi Learning Environment, a multi-agent environment by DeepMind; and the Spinning Up Workshop, a workshop hosted by OpenAI.
RL Weekly 6: AlphaStar, Rectified Nash Response, and Causal Reasoning with Meta RL
This week, we look at AlphaStar, a StarCraft II AI; PSRO_rN, an evaluation algorithm encouraging a diverse population of well-trained agents; and a novel meta-RL approach for causal reasoning. All three results are from DeepMind.
RL Weekly 5: Robust Control of Legged Robots, Compiler Phase-Ordering, and Go-Explore on Sonic the Hedgehog
This week, we look at impressive robust control of legged robots by ETH Zurich and Intel, compiler phase-ordering by UC Berkeley and MIT, and a partial implementation of Uber's Go-Explore.
RL Weekly 4: Generating Problems with Solutions, Optical Flow with RL, and Model-free Planning
In this issue, we introduce a new curriculum learning algorithm by Uber AI Labs, a model-free planning algorithm by DeepMind, and an optical-flow-based control algorithm by Intel Labs and the University of Freiburg.
RL Weekly 3: Learning to Drive through Dense Traffic, Learning to Walk, and Summarizing Progress in Sim-to-Real
In this issue, we introduce the DeepTraffic competition from Lex Fridman's MIT Deep Learning for Self-Driving Cars course. We also review a new paper on using SAC to control a four-legged robot, and introduce a website summarizing progress in sim-to-real algorithms.
RL Weekly 2: Tuning AlphaGo, Macro-strategy for MOBA, Sim-to-Real with conditional GANs
In this issue, we discuss hyperparameter tuning for AlphaGo from DeepMind, a hierarchical RL model for a MOBA game from Tencent, and a GAN-based sim-to-real algorithm from X, Google Brain, and DeepMind.
RL Weekly 1: Soft Actor-Critic Code Release; Text-based RL Competition; Learning with Training Wheels
In this inaugural issue of the RL Weekly newsletter, we discuss Soft Actor-Critic (SAC) from BAIR, the new TextWorld competition by Microsoft Research, and AsDDPG from the University of Oxford and Heriot-Watt University.