Reinforcement Learning
An Introduction to RL
Preface
While I am in no way an expert, and am honestly straight-up inexperienced in reinforcement learning, I think a lot of the modern material gives poorly explained overviews of what is happening, leaving people scratching their heads at the bigger picture even when they understand the finer points. I wanted to write an article that helps alleviate this problem. The material in this article is heavily based on the class I took last semester at Cornell, Wen Sun’s 4789: Introduction to Reinforcement Learning.
The thing we care about
At some level, all of reinforcement learning comes back to the idea of a Markov Decision Process, or MDP. Specifically, we care about finding the best way to act in an MDP. An MDP is made up of several components.
Definition:
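Markov Decision Process An MDP is defined as \(MDP = \{S, A, P, r, \mu\}\) where $S$ is the state space, $A$ is the action space, $P$ is the transition function giving the distribution over next states for each state-action pair, $r$ is the reward function, and $\mu$ is the initial state distribution. A policy $\pi$, which maps states to actions, is not part of the MDP itself; it is the thing we optimize.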
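To make the pieces concrete, here is a minimal sketch of a small tabular MDP in Python. The class name `TabularMDP`, the `sample` and `rollout` helpers, and the two-state toy problem are all made up for illustration. The sketch also assumes a transition table (the distribution over next states for each state-action pair), since a simulation needs one to step forward; the policy is passed in separately as a plain function from state to action.

```python
import random
from dataclasses import dataclass


@dataclass
class TabularMDP:
    states: list      # S: the state space
    actions: list     # A: the action space
    mu: dict          # initial state distribution: state -> probability
    reward: dict      # r: (state, action) -> float
    transition: dict  # assumed here: (state, action) -> {next_state: probability}


def sample(dist, rng):
    """Draw one outcome from a {outcome: probability} dict."""
    outcomes = list(dist)
    weights = [dist[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]


def rollout(mdp, policy, steps, rng):
    """Follow `policy` (state -> action) for `steps` steps and return the total reward."""
    state = sample(mdp.mu, rng)
    total = 0.0
    for _ in range(steps):
        action = policy(state)
        total += mdp.reward[(state, action)]
        state = sample(mdp.transition[(state, action)], rng)
    return total


# Hypothetical toy problem: "stay" keeps the current state, "move" flips it.
# Only staying in the "warm" state pays a reward.
toy = TabularMDP(
    states=["cold", "warm"],
    actions=["stay", "move"],
    mu={"cold": 1.0},
    reward={
        ("cold", "stay"): 0.0, ("cold", "move"): 0.0,
        ("warm", "stay"): 1.0, ("warm", "move"): 0.0,
    },
    transition={
        ("cold", "stay"): {"cold": 1.0}, ("cold", "move"): {"warm": 1.0},
        ("warm", "stay"): {"warm": 1.0}, ("warm", "move"): {"cold": 1.0},
    },
)


def go_warm_then_stay(state):
    """A hand-written policy: leave "cold" once, then stay put."""
    return "move" if state == "cold" else "stay"


if __name__ == "__main__":
    print(rollout(toy, go_warm_then_stay, steps=5, rng=random.Random(0)))
```

Different policies run on the same MDP collect different total rewards, and that gap is what optimization is about: in this toy problem, always choosing "stay" never leaves the "cold" state and earns nothing.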
That said, we need to draw a distinction between two kinds of MDP: finite horizon and infinite horizon.
Definition:
Finite Horizon Markov Decision Process Hi
Definition:
Infinite Horizon Markov Decision Process Hi
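Since the two settings differ mainly in the quantity being optimized, it may help to see the objectives side by side. What follows is the usual textbook formulation, not something taken from this article: the horizon $H$, the discount factor $\gamma \in [0, 1)$, and the time-indexed states $s_h$ and actions $a_h$ are all symbols assumed here, with $s_0 \sim \mu$ and actions chosen by a policy.

\[ \text{Finite horizon:} \quad \mathbb{E}\left[ \sum_{h=0}^{H-1} r(s_h, a_h) \right] \]

\[ \text{Infinite horizon (discounted):} \quad \mathbb{E}\left[ \sum_{h=0}^{\infty} \gamma^{h} \, r(s_h, a_h) \right] \]

Under this reading, the finite horizon case stops after $H$ steps, while the infinite horizon case runs forever and relies on $\gamma < 1$ to keep the sum finite when rewards are bounded.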
Convergence of RL Systems with Finite Policy Action Space
PSYCH, you thought I would actually fill this out. I'm so lazy lol