# Observational Overfitting in Reinforcement Learning

Song et al., 2019 | https://arxiv.org/abs/1912.02975

• Observational overfitting: The agent overfits to properties of the observation that are irrelevant to the latent dynamics of the MDP.
• Effect: This can hinder generalization to the test set.
• Evidence 1: The scoreboard and background objects are highlighted in red in the saliency map.
• Evidence 2: Covering the scoreboard with a black rectangle during training resulted in a 10% increase in test performance.
• Solution?: Overparametrization can help as a form of “implicit regularization,” improving generalization to the test set.
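
The masking step in Evidence 2 could be sketched roughly as follows. This is a minimal illustration, not the authors' code: the helper name `mask_region`, the 84x84 RGB frame, and the scoreboard coordinates are all assumptions chosen for the example.

```python
import numpy as np


def mask_region(frame, top, left, height, width):
    """Return a copy of `frame` with a rectangle set to zero (black).

    Hypothetical helper; the rectangle is given in pixel coordinates.
    """
    masked = frame.copy()
    # The trailing ellipsis covers both grayscale (H, W) and color (H, W, C) frames.
    masked[top:top + height, left:left + width, ...] = 0
    return masked


# Assumed example: an all-white 84x84 RGB frame whose top 10 rows hold the scoreboard.
frame = np.full((84, 84, 3), 255, dtype=np.uint8)
masked = mask_region(frame, top=0, left=0, height=10, width=84)
```

In a training loop, a function like this would be applied to every observation before it reaches the policy, so the agent never sees the scoreboard pixels.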