Reinforcement Learning and Artificial Intelligence (RLAI) 

Instructor: Patrick M. Pilarski (pilarski@ualberta.ca) (http://www.ualberta.ca/~pilarski)
Office: 5020F Katz Group Centre. Office Hours: After class, Monday and Wednesday, 3:20-4:00pm.
Class Time: Monday and Wednesday, 2:00-3:20pm in ECHA L1 472. Please note: L1 is the basement of the ECHA; our room is located in the blue region. First class Monday, January 5th.
Description: This course will provide a comprehensive introduction to reinforcement learning as an approach to artificial intelligence, emphasizing the design of complete agents interacting with stochastic, incompletely known environments. Reinforcement learning has adapted key ideas from machine learning, operations research, psychology, and neuroscience to produce some strikingly successful engineering applications. The focus is on algorithms for learning what actions to take, and when to take them, so as to optimize long-term performance. This may involve sacrificing immediate reward to obtain greater reward in the long term or just to obtain more information about the environment. The course will cover Markov decision processes, dynamic programming, temporal-difference learning, Monte Carlo reinforcement learning methods, eligibility traces, the role of function approximation, and the integration of learning and planning. The course will emphasize the development of intuition relating the mathematical theory of reinforcement learning to the design of human-level artificial intelligence. Through a final written term paper, students will also develop and demonstrate an understanding of the practical application of modern reinforcement learning techniques to address real-world problems and domains in industry, medicine, and neurobiology.
Textbook: Reinforcement Learning: An Introduction, second edition in progress, by Richard S. Sutton and Andrew G. Barto. New version to be distributed in class.
Other Materials: Most materials will be distributed via the course Dropbox folder.
Prerequisites: Interest in learning approaches to artificial intelligence; basic probability theory; computer programming ability. You should be comfortable with statistical ideas such as probability distributions and expected values. Familiarity with linear algebra would be helpful but is not required. Due to a limited class size, students should be using or intend to use reinforcement learning in their thesis research; permission of the instructor is mandatory prior to enrolment.
Written Exercises: There will be a set of exercises for most chapters. These will be due at the beginning of the second day on which the chapter is covered in class. All exercises will be marked and returned to you. Answer sheets for each set of exercises will be made available in class on the day the exercises are due, so your exercises must be turned in on time.
Grading will be on the basis of (with relative weighting):
5  Written exercises (5)
4  Midterm exam
3  Programming projects (3)
4  Final written report
Academic Integrity: The University of Alberta is committed to the highest standards of academic integrity and honesty. Students are expected to be familiar with these standards regarding academic honesty and to uphold the policies of the University in this respect. Students are particularly urged to familiarize themselves with the provisions of the Code of Student Behaviour and avoid any behaviour which could potentially result in suspicions of cheating, plagiarism, misrepresentation of facts and/or participation in an offence. Academic dishonesty is a serious offence and can result in suspension or expulsion from the University.