Document worth reading: “A Short Survey on Probabilistic Reinforcement Learning”

A reinforcement learning agent tries to maximize its cumulative payoff by interacting with an unknown environment. It is important for the agent to explore suboptimal actions as well as to pick the actions with the highest known rewards. Yet, in sensitive domains, collecting more data through exploration is not always possible, but it is still important to find a policy with a certain performance guarantee. In this paper, we present a brief survey of methods available in the literature for balancing the exploration-exploitation trade-off and for computing robust solutions from fixed samples in reinforcement learning.
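To make the exploration-exploitation trade-off mentioned in the abstract concrete, here is a minimal sketch of epsilon-greedy action selection on a Bernoulli multi-armed bandit. It is not taken from the survey: the function name `epsilon_greedy_bandit`, the parameters `true_means`, `steps`, `epsilon`, and `seed`, and the choice of epsilon-greedy itself are assumptions made purely for illustration.

```python
import random


def epsilon_greedy_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Illustrative epsilon-greedy agent on a Bernoulli bandit.

    true_means is a hypothetical list of per-arm success probabilities;
    the agent never reads it directly, it only observes sampled rewards.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # running mean reward per arm
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try an arm uniformly at random, even a seemingly bad one.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest estimated reward so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Sample a 0/1 reward from the chosen arm.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the running mean for the chosen arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return estimates, counts, total_reward


if __name__ == "__main__":
    estimates, counts, total = epsilon_greedy_bandit([0.2, 0.5, 0.8])
    print("estimated means:", estimates)
    print("pulls per arm:  ", counts)
    print("total reward:   ", total)
```

With `epsilon=0` the agent only exploits and can lock onto a poor arm; a larger `epsilon` gathers more information at the cost of short-term reward. The fixed-sample setting that the abstract also mentions corresponds to the case where no further pulls are allowed and a policy must be chosen from the data already collected.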