📜 Course Description

This course provides an introduction to the field of reinforcement learning. Students will learn about the approaches and challenges of reinforcement learning, including generalization and exploration. The course introduces statistical learning techniques in which an agent explicitly takes actions and interacts with the world.

🏁 Pre-requisites for this class

Proficiency in Python. All class assignments will be in Python. There is a tutorial here for those who aren't as familiar with Python. If you have a lot of programming experience but in a different language (e.g., JavaScript or Java), you will probably be fine.
College Calculus, Linear Algebra. You should be comfortable taking derivatives and understanding matrix-vector operations and notation.
Basic Probability and Statistics. You should know the basics of probability, Gaussian distributions, mean, standard deviation, etc.
Foundations of Machine Learning. We will be formulating cost functions, taking derivatives and performing optimization with gradient descent. Some optimization tricks will be more intuitive with some knowledge of convex optimization.
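
As a rough self-check for the machine learning prerequisite, the sketch below formulates a cost function, takes its derivative by hand, and minimizes it with gradient descent. The data, the function names (`cost`, `cost_gradient`, `gradient_descent`), and the learning rate are illustrative assumptions, not course material.

```python
# Minimal sketch: fit y ≈ w * x by gradient descent on a mean squared error cost.
# All names and numbers here are hypothetical, chosen only for illustration.

def cost(w, xs, ys):
    """Mean squared error J(w) = mean((w * x - y)^2)."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def cost_gradient(w, xs, ys):
    """Derivative dJ/dw = mean(2 * (w * x - y) * x)."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

def gradient_descent(xs, ys, w0=0.0, learning_rate=0.05, steps=200):
    """Repeatedly step against the gradient, starting from w0."""
    w = w0
    for _ in range(steps):
        w -= learning_rate * cost_gradient(w, xs, ys)
    return w

# Toy data generated by y = 2 * x, so the minimizer of the cost is w = 2.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 2.0, 4.0, 6.0]
w_hat = gradient_descent(xs, ys)
print(w_hat, cost(w_hat, xs, ys))
```

If you can follow why the update uses the negative gradient and why too large a learning rate would make the iterates diverge, you likely have the background assumed here.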

🚀 Learning Outcomes

By the end of the class students should be able to:

Define the key features of reinforcement learning that distinguish it from AI and non-interactive machine learning.
Given an application problem (e.g., from computer vision or robotics), decide whether it should be formulated as an RL problem; if so, define it formally (in terms of the state space, action space, dynamics, and reward model), state which algorithm from class is best suited to addressing it, and justify the answer.
Implement common RL algorithms in code.
Describe (list and define) multiple criteria for analyzing RL algorithms and evaluate algorithms on these metrics, e.g., regret, sample complexity, computational complexity, empirical performance, and convergence.
Describe the exploration vs exploitation challenge and compare and contrast at least two approaches for addressing this challenge (in terms of performance, scalability, complexity of implementation, and theoretical guarantees).
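
To give a flavor of the exploration vs. exploitation challenge and of regret as an evaluation metric, here is a minimal sketch of epsilon-greedy action selection on a Bernoulli multi-armed bandit. Epsilon-greedy is only one possible approach, and the function name `epsilon_greedy`, the arm means, and the parameter values are assumptions made for illustration rather than material taken from the course.

```python
import random

def epsilon_greedy(arm_means, epsilon, steps, seed=0):
    """Run epsilon-greedy on a Bernoulli bandit (hypothetical setup).

    arm_means: true success probability of each arm (unknown to the agent).
    Returns pull counts, estimated means, and cumulative expected regret.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    best_mean = max(arm_means)
    regret = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest estimated mean.
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the sample mean for the pulled arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        # Expected regret: gap between the best arm and the chosen arm.
        regret += best_mean - arm_means[arm]
    return counts, estimates, regret

counts, estimates, regret = epsilon_greedy([0.2, 0.5, 0.8], epsilon=0.1, steps=1000)
print(counts, regret)
```

Varying `epsilon` between 0 (never explore) and 1 (always explore) and comparing the resulting regret is one simple way to see the trade-off in practice.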

📅 Course Outline and Timeframe