Principled Methods for Advising Reinforcement Learning Agents

Executive Summary

An important issue in reinforcement learning is how to incorporate expert knowledge in a principled manner, especially as one scales up to real-world tasks. This paper presents a method for incorporating arbitrary advice into the reward structure of a reinforcement learning agent without altering the optimal policy. The method extends the potential-based shaping method proposed by Ng et al. (1999) to shaping functions based on both states and actions. This allows much more specific information to guide the agent (namely, which action to choose) without requiring the agent to discover this from the rewards on states alone.
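
As a rough illustration of the idea summarized above, the sketch below contrasts a state-only potential-based shaping term, F(s, s') = gamma * Phi(s') - Phi(s), with an assumed state-action analogue, F(s, a, s', a') = gamma * Phi(s', a') - Phi(s, a). The function names, the toy potential, and the trajectory are hypothetical and chosen only for illustration; the summary does not give the paper's exact formulation, so this should be read as a sketch under those assumptions rather than as the authors' implementation.

```python
from typing import Callable, Hashable, List, Tuple

State = Hashable
Action = Hashable


def state_shaping(phi: Callable[[State], float],
                  s: State, s_next: State, gamma: float) -> float:
    """State-only potential-based shaping term: gamma * Phi(s') - Phi(s)."""
    return gamma * phi(s_next) - phi(s)


def state_action_shaping(phi: Callable[[State, Action], float],
                         s: State, a: Action,
                         s_next: State, a_next: Action,
                         gamma: float) -> float:
    """Assumed state-action analogue: gamma * Phi(s', a') - Phi(s, a)."""
    return gamma * phi(s_next, a_next) - phi(s, a)


def discounted_shaping_sum(phi: Callable[[State, Action], float],
                           trajectory: List[Tuple[State, Action]],
                           gamma: float) -> float:
    """Sum gamma^t * F_t over consecutive (state, action) pairs."""
    total = 0.0
    for t in range(len(trajectory) - 1):
        (s, a), (s_next, a_next) = trajectory[t], trajectory[t + 1]
        total += (gamma ** t) * state_action_shaping(
            phi, s, a, s_next, a_next, gamma)
    return total


def advice(s: State, a: Action) -> float:
    """Hypothetical advice potential: prefer the action 'right' everywhere."""
    return 1.0 if a == "right" else 0.0


# Hypothetical three-step trajectory on a chain of integer states.
trajectory = [(0, "right"), (1, "right"), (2, "left")]
gamma = 0.9
shaped_total = discounted_shaping_sum(advice, trajectory, gamma)

# The discounted sum telescopes to gamma^T * Phi(s_T, a_T) - Phi(s_0, a_0).
T = len(trajectory) - 1
telescoped = (gamma ** T) * advice(*trajectory[-1]) - advice(*trajectory[0])
```

Because the shaping terms telescope, their discounted sum along a trajectory depends only on the potentials at its first and last state-action pairs. This is the intuition usually given for why potential-based terms can add guidance without rewarding the agent for cycling through states.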

  • Format: PDF
  • Size: 174.6 KB