Studying Artificial Intelligence, from backbone to application.
RL Weekly 38: Clipped objective is not why PPO works, and the Trap of Saliency maps
In this issue, we look at the effect of PPO's code-level optimizations and the study of saliency maps in RL.
22 Dec 2019
RL Weekly 37: Observational Overfitting, Hindsight Credit Assignment, and Procedurally Generated Environment Suite
In this issue, we look at Google and MIT's study on the observational overfitting phenomenon and how overparametrization helps generalization, a new family of algorithms...
11 Dec 2019
RL Weekly 36: AlphaZero with a Learned Model achieves SotA in Atari
In this issue, we look at MuZero, DeepMind's new algorithm that learns a model and achieves AlphaZero performance in Chess, Shogi, and Go and achieves...
28 Nov 2019
RL Weekly 35: Escaping Local Optima in Distance-based Rewards and Choosing the Best Teacher
In this issue, we look at an algorithm that uses sibling trajectories to escape local optima in distance-based shaped rewards, and an algorithm that dynamically...
21 Nov 2019
RL Weekly 34: Dexterous Manipulation of the Rubik's Cube and Human-Agent Collaboration in Overcooked
In this issue, we look at a robot hand manipulating and "solving" the Rubik's Cube. We also look at comparative performances of human-agnostic and human-aware...
28 Oct 2019
RL Weekly 33: Action Grammar, the Squashing Exploration Problem, and Task-relevant GAIL
In this issue, we look at Action Grammar RL, a hierarchical RL framework that adds new macro-actions, improving performance of DDQN and SAC in Atari...
14 Oct 2019